Create a new ML use case
Once you have completed the cleaning step and have a clean, well-structured dataset, the logical next step is to use it in a Machine Learning operation. The AutoML module is a collection of one or more ML use cases, each containing multiple experiments. Each experiment represents not only an ML model but also the data processing operations needed to train it. The AutoML module lets you compare the performance of each experiment and pick the one that best suits your needs.
Before training a model, however, you must create a new AutoML use case. To do so, click on your prepared dataset in your project's Flow and select the yellow ML gear icon in the left sidebar.
This opens the AutoML interface, and specifically the use case list, where you create the use cases you need.
To create a use case, select Create a new use case, either in the top right corner or in the middle of the screen (if this is your first use case).
A wizard then guides you interactively through creating your use case in 3 steps:
First step: Select the type of use case you want to solve from the available options.
The available use case types are:
Binary Classification, Multi Classification, Regression: to predict the value of a target column that you must specify.
Clustering: to detect groups based on characteristics that the model finds automatically, with no target specified.
CQR (Conformal Quantile Regression): a state-of-the-art type of Regression that includes advanced uncertainty estimation.
TS Forecasting: to predict future continuous numerical values, like a Regression, but based on values previously observed over a period of time.
Survival models: a specific type of regression used to estimate the time until an event occurs (such as the death of a patient or an error status of a machine).
List of available ML tasks for your use case
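Of the types listed above, survival models need the most specific data layout: each row carries a duration and a status saying whether the event was actually observed. The sketch below is only an illustration of that layout in plain Python. The column names `duration` and `event`, the 0/1 encoding of the status, and the helper `summarize_survival_rows` are assumptions made for the example, not names used by the platform.

```python
def summarize_survival_rows(rows, duration_col="duration", event_col="event"):
    """Check a survival-style table and count observed vs. censored rows.

    rows: list of dicts, one per record (hypothetical layout).
    duration_col: name of the column holding the time until the event
                  or until the end of observation.
    event_col: name of the column holding 1 if the event occurred, else 0.
    """
    observed = 0
    censored = 0
    for index, row in enumerate(rows):
        duration = row[duration_col]
        event = row[event_col]
        # A duration is a length of time, so it cannot be negative.
        if duration < 0:
            raise ValueError(f"row {index}: negative duration {duration!r}")
        # The status is assumed to be encoded as 0 (not yet occurred) or 1 (occurred).
        if event not in (0, 1):
            raise ValueError(f"row {index}: event status must be 0 or 1, got {event!r}")
        if event == 1:
            observed += 1
        else:
            censored += 1
    return {"observed": observed, "censored": censored}


machines = [
    {"duration": 12, "event": 1},  # failed after 12 days
    {"duration": 30, "event": 0},  # still running when observation ended
    {"duration": 7, "event": 1},   # failed after 7 days
]
summary = summarize_survival_rows(machines)
```

Rows with status 0 are kept rather than dropped: they still tell the model that the event had not occurred up to the recorded duration.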
Second step: Choose the target and the settings that go with it. Every type of model requires a target column except Clustering.
For both Classification types, the platform expects either categorical data or discrete integer values as the target class.
For Regression, the platform expects numerical data as the target.
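The target expectations above can be checked before opening the wizard. The following is a minimal sketch in plain Python, not the platform's own validation: the function `target_fits` is hypothetical, and the rule that a binary target has exactly two distinct classes while a multi-class target has more than two is an assumption based on the usual meaning of those terms.

```python
def target_fits(use_case_type, values):
    """Return True if the target values look suitable for the use case type.

    use_case_type: "Binary Classification", "Multi Classification" or "Regression".
    values: the raw values of the candidate target column (None = missing).
    """
    present = [v for v in values if v is not None]
    if not present:
        return False

    if use_case_type in ("Binary Classification", "Multi Classification"):
        # Categorical labels (strings) or discrete integers are accepted;
        # floats are rejected because they are not discrete classes.
        if not all(isinstance(v, (str, int)) for v in present):
            return False
        n_classes = len(set(present))
        # Assumption: binary means exactly 2 classes, multi means more than 2.
        if use_case_type == "Binary Classification":
            return n_classes == 2
        return n_classes > 2

    if use_case_type == "Regression":
        # Numerical data only; booleans are excluded even though Python
        # treats them as integers.
        return all(
            isinstance(v, (int, float)) and not isinstance(v, bool)
            for v in present
        )

    raise ValueError(f"unsupported use case type: {use_case_type!r}")


churn_ok = target_fits("Binary Classification", ["yes", "no", "yes"])
price_ok = target_fits("Regression", [1.5, 2, 3.7])
```

A column that fails this kind of check usually needs another pass through the cleaning step, for example to convert numeric strings into numbers.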
Apart from Clustering and TS Forecasting, the available model types require additional settings from the user:
- For both Classification types and Regression, a transformer to apply to the target must be selected; by default, no transformer is applied.
- For CQR, in addition to the target, you select the confidence bounds you want to apply to your predictions.
- For Survival models, you need to indicate the survival duration column with the positive outcome and the event status column.
Choose the target and its transformer step
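To give an idea of what the CQR confidence bounds mean, here is a rough sketch of the calibration step in the standard conformalized quantile regression procedure. It is not the platform's implementation. It assumes a lower and an upper quantile model have already been trained, and the names `cqr_correction`, `conformalize`, and `alpha` (1 minus the desired coverage) are chosen for this example.

```python
import math


def cqr_correction(lower_cal, upper_cal, y_cal, alpha):
    """Return the conformal correction for a miscoverage level alpha.

    lower_cal, upper_cal: quantile predictions on a held-out calibration set.
    y_cal: the true target values for the same rows.
    alpha: 1 - desired coverage (for example 0.1 for 90% intervals).
    """
    # Conformity score: how far each true value falls outside its predicted
    # interval (negative when the value lies inside the interval).
    scores = sorted(
        max(lo - y, y - hi) for lo, hi, y in zip(lower_cal, upper_cal, y_cal)
    )
    n = len(scores)
    # Rank of the (1 - alpha) quantile with the finite-sample correction.
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        # Too few calibration rows for this coverage: the interval is unbounded.
        return math.inf
    return scores[k - 1]


def conformalize(lower, upper, correction):
    """Widen (or shrink, if the correction is negative) a predicted interval."""
    return lower - correction, upper + correction


correction = cqr_correction(
    lower_cal=[1, 2, 3, 4],
    upper_cal=[3, 4, 5, 6],
    y_cal=[2, 4.5, 2.5, 5],
    alpha=0.5,
)
interval = conformalize(10, 12, correction)
```

The tighter the bounds you ask for, the more calibration rows are needed before the correction becomes finite, which is why very small datasets can produce wide intervals.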
Final step: Enter the name (required), description and tags (optional) of your new use case, then create it.
Use case settings step
Once the use case is created, you can access the experiment interface, where you test multiple models and compare them. This is covered next.