Using a Time-Series
Time-series use cases call for a dedicated approach, both for analyzing the data and for training Time-Series (TS) models. This section covers three specialized tools for working with time-series datasets: TS Analysis, TS Forecasting, and TS Anomaly Detection.
Time-Series Analysis
The TS Analysis module provides statistics and plots that help you understand your data before you build a Machine Learning (ML) pipeline.
To start, click the dataset you want to analyze and select the ML icon in the left sidebar. This opens the ML Lab, where you build your ML pipeline and experiment with different models to find the one best suited to your use case.
In the ML Lab, click the Create use case button and choose Blank use case in the pop-up.
Setup takes three steps: first, select the ML Task (in our case, TS Forecasting) and click Next; second, define the target column, the date column, and the other TS-related parameters; third, give your use case a name.
Once the use case is set up, you reach the TS sanity check, where papAI automatically detects the time-series frequency and verifies that the series is correctly resampled. In our case the check passes, so the ML pipeline can be created.
Warning
If the sanity check fails because the detected frequency is inconsistent, click the Resample your dataset button to open the TS cleaning interface and apply a resampling of your choice.
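To give a sense of what a frequency check involves, here is a minimal Python sketch. It is not papAI's implementation: the function name `detect_frequency` and the rule it applies (take the most common gap between consecutive timestamps and require every gap to match it) are assumptions made for illustration.

```python
from collections import Counter
from datetime import datetime, timedelta


def detect_frequency(timestamps):
    """Return (dominant_gap, is_regular) for a list of datetimes.

    Hypothetical helper: dominant_gap is the most common spacing between
    consecutive timestamps, is_regular is True when every spacing equals it.
    """
    ordered = sorted(timestamps)
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    if not gaps:
        raise ValueError("need at least two timestamps")
    dominant_gap, _ = Counter(gaps).most_common(1)[0]
    is_regular = all(gap == dominant_gap for gap in gaps)
    return dominant_gap, is_regular


start = datetime(2024, 1, 1)
daily = [start + timedelta(days=i) for i in range(10)]
with_hole = daily[:4] + daily[5:]  # same series with one day missing

print(detect_frequency(daily))      # regular daily series
print(detect_frequency(with_hole))  # daily, but one gap is two days long
```

A series like `with_hole` is the kind of case where resampling to a regular daily grid would be needed before modeling.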
Next comes the TS Analysis section, where results are computed and displayed automatically, giving an overview of the use case and the time-series.
The first tool is a visualization of the time-series, with customization options for exploring the data.
Next are statistics on the target and the time-series, including the minimum and maximum values of the target, as well as TS-specific indicators such as periodicity and stationarity.
The ACF and PACF plots help you determine the number of lags and give insight into stationarity and into the autoregressive (AR) and moving-average (MA) parameters.
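For readers unfamiliar with the ACF, the sketch below computes the standard sample autocorrelation in plain Python. It only illustrates the quantity that an ACF plot displays for each lag; the function name `autocorrelation` is hypothetical and the code is not taken from papAI.

```python
def autocorrelation(series, max_lag):
    """Sample autocorrelation of `series` for lags 0..max_lag."""
    n = len(series)
    mean = sum(series) / n
    centered = [x - mean for x in series]
    variance = sum(c * c for c in centered)
    if variance == 0:
        raise ValueError("autocorrelation is undefined for a constant series")
    acf = []
    for lag in range(max_lag + 1):
        # Covariance between the series and itself shifted by `lag` steps.
        cov = sum(centered[t] * centered[t + lag] for t in range(n - lag))
        acf.append(cov / variance)
    return acf


# Toy series that repeats every 4 steps.
series = [0, 1, 0, -1] * 6
acf = autocorrelation(series, max_lag=8)
for lag, value in enumerate(acf):
    print(lag, round(value, 3))
```

On this toy series the autocorrelation is strongly positive at lags 4 and 8 and strongly negative at lag 2, which is the signature of a period of 4.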
The last analysis tool is the STL decomposition, which breaks the time-series down into its trend, seasonality, and residuals, a key step in understanding the underlying dynamics of your data.
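The idea of splitting a series into trend, seasonality, and residuals can be illustrated with a short sketch. Note that this is not STL: STL relies on LOESS smoothing, whereas the code below uses the simpler classical additive decomposition with a centered moving average, and only for odd seasonal periods. The function name `decompose_additive` is hypothetical.

```python
def decompose_additive(series, period):
    """Classical additive decomposition (not STL) for an odd seasonal period."""
    if period % 2 == 0:
        raise ValueError("this sketch only handles odd periods")
    n = len(series)
    half = period // 2

    # Trend: centered moving average, left undefined (None) at the edges.
    trend = [None] * n
    for t in range(half, n - half):
        window = series[t - half : t + half + 1]
        trend[t] = sum(window) / period

    # Seasonality: average detrended value at each position in the cycle.
    buckets = [[] for _ in range(period)]
    for t in range(n):
        if trend[t] is not None:
            buckets[t % period].append(series[t] - trend[t])
    means = [sum(b) / len(b) for b in buckets]
    offset = sum(means) / period  # center the seasonal effects on zero
    seasonal = [means[t % period] - offset for t in range(n)]

    # Residuals: what trend and seasonality together do not explain.
    residual = [
        None if trend[t] is None else series[t] - trend[t] - seasonal[t]
        for t in range(n)
    ]
    return trend, seasonal, residual


# Toy series: linear trend plus a repeating pattern of period 3.
pattern = [3, -1, -2]
series = [0.5 * t + pattern[t % 3] for t in range(30)]
trend, seasonal, residual = decompose_additive(series, period=3)
```

On this noise-free toy series the decomposition recovers the linear trend and the repeating pattern, and the residuals are zero up to rounding error.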
Here is a video showcasing the TS Analysis module
Time-Series Forecasting
After creating your use case and analyzing your time-series, the next step is to build your Machine Learning pipeline. From the TS Analysis interface, open the Experiments tab in the top right corner and click the Create Experiment button.
A pop-up window guides you through the steps to build your pipeline and experiment with different models:
- Forecasting Window Setup:
Set the forecasting window: the number of past data points used for training and the number of data points to forecast.
- Model Selection:
Select the type of model for your experiment. Here we choose the Simple Exponential Smoothing model. Toggle the button next to a model to enable it, and adjust its parameters if necessary. You can also select and train several models at the same time.
- Evaluation:
In the final step, define the starting point of the time-series training window. Then click the Create button and select Create experiment and train it now. Training starts immediately, and the training status is displayed along with the relevant metrics. Training has completed successfully when the status is marked as Success.
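To make the forecasting window and the chosen model more concrete, here is a minimal sketch of textbook Simple Exponential Smoothing in plain Python. It is not papAI's implementation; the names `simple_exponential_smoothing`, `past_points`, `alpha`, and `horizon` are hypothetical stand-ins for the number of past data points, the smoothing parameter, and the number of forecasted data points.

```python
def simple_exponential_smoothing(history, alpha, horizon):
    """Forecast `horizon` points from `history` with smoothing factor `alpha`."""
    if not history:
        raise ValueError("history must not be empty")
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    # The level starts at the first observation and is updated recursively.
    level = history[0]
    for value in history[1:]:
        level = alpha * value + (1 - alpha) * level
    # SES has no trend or seasonal term, so the forecast is constant.
    return [level] * horizon


series = [10.0, 12.0, 11.0, 13.0, 12.0, 14.0]
past_points = 4  # hypothetical: number of past data points used for training
forecast = simple_exponential_smoothing(series[-past_points:], alpha=0.5, horizon=3)
print(forecast)  # [13.0, 13.0, 13.0]
```

Because Simple Exponential Smoothing models only a level, every forecasted point has the same value. A higher `alpha` makes that level follow recent observations more closely.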
After assessing the trained model, you can add it to your Flow by creating a model registry. The registry is required to make predictions on other datasets in the Flow and to deploy the model to production.
To do so, click the three dots to the right of the model run, then click Add model to the flow. Fill in the required details for the recipe and the registry, and click Submit. Back in the Flow, you will find the model registry containing the selected model from the ML Lab, ready for use.
Now, let's apply a prediction operation using this new model and a dataset from the Flow. Select the new model registry and click the prediction operation. Add an input dataset and an output dataset in the left sidebar, then click Continue. The next interface shows information about the model and its modifiable parameters. Specify the prediction length, then start the process by selecting Save recipe and save it now under the Save button. A green check next to the output dataset shows that the process has finished.
To compare real and predicted values, open the created dataset and go to the Visualization tab. Use the data visualization module to build a line plot of the model's predictions: select Line as the plot type, set the datetime column as the X-axis, choose the target column as the Y-axis, and assign the prediction tag to the color so that real and predicted values are distinguished.
The resulting plot shows how well the model reproduces the behavior of the time-series it was trained on, which in this example follows a clear upward trend over the years.
Here is a video showcasing the TS Forecasting tool
Time-Series Anomaly Detection
This feature distinguishes regular, expected behavior from irregular, unexpected behavior in the data. Unsupervised algorithms analyze temporal patterns and trends to establish a baseline of normal behavior and flag any deviation from that baseline as an anomaly.
To start anomaly detection, select a time-series dataset from your Flow and click the Anomaly Detection icon in the left sidebar.
In the same area, set up the input and output datasets by clicking the plus icon, then click Continue to proceed.
This opens a new window where you configure your model and define the targets. Select the target column on which to run anomaly detection and choose a model from the catalog of available models.
Here we choose the Seasonal anomaly detector. Clicking the selected target column reveals additional parameters for configuring the model.
Once configuration is complete, select Create recipe and run it now under the Create button to launch the process.
Once the process has started, wait for the output dataset to be completed. To visualize the results, double-click the dataset and open the Visualization tab. Create a new line plot by selecting the Line plot type, with the date column as the X-axis and the anomaly target column as the Y-axis. With the color axis option, anomalies in the series are highlighted in red, while normal data points remain blue.
For instance, the data points in March 2020 and March 2021 are flagged as anomalies because they deviate significantly from the norm.
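The idea of a seasonal baseline described in this section can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the algorithm behind papAI's Seasonal anomaly detector: the function name `seasonal_anomalies`, the `period` and `threshold` parameters, and the choice of a median and median absolute deviation (MAD) baseline are all hypothetical.

```python
from statistics import median


def seasonal_anomalies(series, period, threshold=3.0):
    """Flag points that deviate strongly from others at the same seasonal phase."""
    flags = [False] * len(series)
    for phase in range(period):
        indices = range(phase, len(series), period)
        values = [series[i] for i in indices]
        # Baseline of normal behavior for this phase of the cycle.
        center = median(values)
        mad = median(abs(v - center) for v in values)
        if mad == 0:
            continue  # no spread to compare against; skipped in this sketch
        for i in indices:
            flags[i] = abs(series[i] - center) / mad > threshold
    return flags


# Toy series with period 4, small cycle-to-cycle variation, and one spike.
base = [10, 20, 30, 20]
series = [base[t % 4] + ((t // 4) % 3 - 1) * 0.5 for t in range(24)]
series[9] = 60  # injected anomaly
flags = seasonal_anomalies(series, period=4)
print([i for i, flagged in enumerate(flags) if flagged])  # [9]
```

Comparing each point only with points at the same position in the cycle means that a value that is high for its season is flagged even when it would look ordinary against the series as a whole.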
Here is a video showcasing the TS anomaly detection