Creating predictions on datasets
Use your ML deployment to predict future outcomes on new data.
You can make predictions on datasets in Catalog, for example, daily predictions on new transactions. Predictions can also be made in real time using the prediction API. For information about the prediction API, see Creating real-time predictions.
When predictions are generated, you can load the predictive insights into a Qlik Sense app. This lets you visualize and interact with the data and create what-if scenarios.
The dataset that you want to make predictions on is known as the apply dataset. The predictions are written to a prediction dataset, which for classification models also includes a column with the probability of each class. Optionally, you can also generate datasets with SHAP values or errors.
The apply dataset must have the same features and data types as the dataset used to train the ML deployment. The target column does not need to be included because it is created automatically when predictions are generated. Similarly, auto-engineered features created during experiment training do not need to be included in the apply dataset. However, the apply dataset must still include the parent feature that was used to generate the auto-engineered features.
Additional columns that were not part of the model training can still be present in the apply dataset. AutoML ignores these columns when generating predictions.
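The schema rules above can be checked before you upload an apply dataset. The following is a minimal sketch, not part of Qlik AutoML: the function `check_apply_schema`, the `SchemaReport` type, the column names, and the data type labels are all hypothetical, and the model schema is assumed to be available as a plain mapping of feature name to data type.

```python
from typing import Dict, List, NamedTuple


class SchemaReport(NamedTuple):
    missing: List[str]      # required features absent from the apply dataset
    mismatched: List[str]   # features present but with a different data type
    ignored: List[str]      # extra columns that would be ignored


def check_apply_schema(
    model_schema: Dict[str, str],
    apply_schema: Dict[str, str],
) -> SchemaReport:
    """Compare an apply dataset's columns against the features a model expects.

    model_schema should list only what the apply dataset must supply:
    no target column and no auto-engineered features, but it must include
    the parent features that the auto-engineered features were built from.
    """
    missing = [name for name in model_schema if name not in apply_schema]
    mismatched = [
        name
        for name, dtype in model_schema.items()
        if name in apply_schema and apply_schema[name] != dtype
    ]
    ignored = [name for name in apply_schema if name not in model_schema]
    return SchemaReport(missing, mismatched, ignored)


# Hypothetical example data. The target column is not listed in model_schema.
model_schema = {"PlanType": "string", "MonthlyFee": "float", "SignupDate": "date"}
apply_schema = {"PlanType": "string", "MonthlyFee": "string", "CustomerNote": "string"}

report = check_apply_schema(model_schema, apply_schema)
print(report)
```

In this example, `SignupDate` is reported as missing, `MonthlyFee` as a type mismatch, and `CustomerNote` as an extra column that would be ignored.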
Any flat file that can be uploaded and profiled in Qlik Cloud is supported for use in Qlik AutoML.
For multi-table files such as Microsoft Excel files with multiple sheets, only the first table will be imported. If data profiling fails for a table (for example, if it is empty), the file is not supported.
When a user creates a prediction configuration, they are automatically assigned as the owner of that prediction configuration.
You can run predictions for ML deployments in shared and managed spaces if you have the Owner, Can edit (shared spaces only), or Can manage role in both the space where the ML deployment exists and the space where the prediction datasets are being created.
In the case of scheduled predictions, the owner of the prediction configuration must have the Owner, Can edit (shared spaces only), or Can manage role in both the space where the ML deployment exists and the space where the prediction datasets are being created. For more information, see Scheduling predictions.
Creating new predictions
You can create new predictions from both the Deployment overview pane and the Dataset predictions pane.
Do the following:
-
Open an ML deployment from Catalog.
-
In the bottom right, click Create prediction.
-
In the Prediction configuration pane, expand Apply data and click Select apply dataset.
-
Select a dataset to generate predictions for. The dataset must have the same features and data types as the Model schema.
Datasets can be uploaded via the hub. You can also upload a new dataset directly into Catalog from the prediction's dataset selection page. This dataset will then automatically be selected as the apply dataset for use in your prediction. To do so, click Add apply dataset and choose the file to upload.
-
Under Prediction dataset, click Name prediction dataset.
-
Enter a name or accept the default name.
Qlik AutoML supports dynamic file naming for prediction datasets. For more information, see Using variables in prediction dataset file names.
-
Select a space.
Information note: You must have the Private Analytics Content Creator role to create datasets in your personal space.
-
Click Confirm.
-
Under Prediction options, select any additional datasets that you want to generate.
-
Errors dataset: Generate a dataset with errors for records in the apply dataset. This lets you know if a record was dropped and for what reason.
-
SHAP: Generate a dataset with SHAP values for each record. The dataset has the columns index and <feature>_shap for each feature in the model.
-
Coordinate SHAP: Generate a dataset with SHAP values for each record. This gives you the same values as the SHAP dataset but organized in a different way. The dataset has the columns index, feature, and shap_value.
-
Choose whether to autogenerate an index column or use an existing column in the apply dataset.
-
Optionally, run your prediction on a schedule. Under Prediction schedule, click Create schedule and adjust the settings in the dialog that appears. For more information, see Scheduling predictions.
-
Click the Save and close button to save your prediction configuration and return to the Dataset predictions pane without running the prediction. You might prefer this option if you only want the predictions to run on a schedule.
Alternatively, click Save and predict now to save the prediction configuration and manually run the prediction.
When Last status shows "Success", the predictions are finished.
-
Go to Catalog to see the generated datasets.
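The SHAP and Coordinate SHAP datasets described under Prediction options hold the same values in two layouts: one row per record with a `<feature>_shap` column per feature, or one row per record and feature. The sketch below illustrates how the two layouts relate. It is an illustration only, not how Qlik AutoML produces the datasets. The function name `to_coordinate_shap`, the feature names, and the values are made up.

```python
from typing import Dict, List

SHAP_SUFFIX = "_shap"


def to_coordinate_shap(rows: List[Dict[str, object]]) -> List[Dict[str, object]]:
    """Reshape rows of (index, <feature>_shap, ...) into (index, feature, shap_value)."""
    long_rows = []
    for row in rows:
        for column, value in row.items():
            # Skip the index column and anything that is not a SHAP column.
            if column == "index" or not column.endswith(SHAP_SUFFIX):
                continue
            long_rows.append(
                {
                    "index": row["index"],
                    "feature": column[: -len(SHAP_SUFFIX)],
                    "shap_value": value,
                }
            )
    return long_rows


# Made-up SHAP dataset with one record and two features.
shap_rows = [{"index": 0, "PlanType_shap": 0.12, "MonthlyFee_shap": -0.05}]

coordinate_rows = to_coordinate_shap(shap_rows)
for row in coordinate_rows:
    print(row)
```

The long layout is often easier to filter and aggregate by feature in a visualization, while the wide layout keeps one row per record.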
Editing prediction configurations
You can edit existing prediction configurations from the Dataset predictions pane.
Do the following:
-
In the Dataset predictions pane, click ... on the prediction configuration you want to edit.
-
Select Edit prediction configuration from the Actions menu.
-
In the Prediction configuration pane, you can edit the following sections:
-
Apply data: You can change the apply dataset.
-
Prediction dataset: You can change the name and space of the prediction dataset.
-
Prediction options: You can change your selections for the additional datasets that are generated.
-
Prediction schedule: You can set or change the schedule on which the prediction runs. For more information, see Scheduling predictions.
-
Click the Save and close button to save your prediction configuration and return to the Dataset predictions pane without running the prediction.
Alternatively, click Save and predict now to save the prediction configuration and manually run the prediction.
When Last status shows "Success", the predictions are finished.
Running predictions
You can run predictions for existing prediction configurations from the Dataset predictions pane. Alternatively, you might want to run your predictions according to a customizable schedule. You can combine manual and scheduled runs of your predictions to best suit your needs.
Running predictions manually
You can run a prediction configuration on demand from its context menu in the Dataset predictions pane.
Do the following:
-
In the Dataset predictions pane, click ... on the prediction configuration you want to run predictions for.
-
Select Run predictions now from the Actions menu to start generating predictions.
When Last status shows "Success", the predictions are finished.
Scheduling predictions
Predictions can be set to run automatically on a schedule. You can create one schedule for each prediction configuration that you create. Access the Prediction schedule menu when creating or editing a prediction configuration.
The Prediction schedule dialog allows you to specify the following parameters for your schedule:
-
Run predictions: Adjust the general schedule on which the prediction will run (daily, weekly, or monthly). Set the interval, day of the week, or day of the month depending on your selection.
-
Time: Configure the time of day at which your prediction will start running.
If you are scheduling by the hour (for daily or weekly predictions), you can also specify a start and end time between which the predictions will run.
-
Start date: Set the date on which the prediction schedule takes effect.
-
End date: Set the date on which the predictions will stop being run on the schedule. By default, the schedule will be set to continue running indefinitely, but you can specify an end date for the schedule.
-
Only run if apply dataset has changed: If there has been no change in your apply dataset since the last prediction was run, a scheduled prediction will not run. You can toggle this setting off if you want to always run the scheduled prediction regardless of changes in the data.
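The way the start date, end date, and Only run if apply dataset has changed settings interact can be sketched as a simple gate that is evaluated at each scheduled time. This is a hypothetical model for illustration, not Qlik's implementation. The `Schedule` class, the `should_run` function, the fingerprint strings used to detect a changed dataset, and the choice to treat the end date as inclusive are all assumptions.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Schedule:
    start_date: date
    end_date: Optional[date] = None   # None models a schedule that runs indefinitely
    only_if_changed: bool = True      # models "Only run if apply dataset has changed"


def should_run(
    schedule: Schedule,
    today: date,
    last_fingerprint: Optional[str],
    current_fingerprint: str,
) -> bool:
    """Decide whether a scheduled prediction should run at a scheduled time.

    The fingerprints are hypothetical stand-ins for "has the apply dataset
    changed since the last prediction run" (for example, a content hash).
    """
    if today < schedule.start_date:
        return False  # schedule has not taken effect yet
    if schedule.end_date is not None and today > schedule.end_date:
        return False  # schedule has ended (end date assumed inclusive)
    if schedule.only_if_changed and last_fingerprint == current_fingerprint:
        return False  # apply dataset unchanged since the last run
    return True


schedule = Schedule(start_date=date(2024, 1, 1), end_date=date(2024, 1, 31))
print(should_run(schedule, date(2024, 1, 15), "v1", "v1"))  # unchanged data
print(should_run(schedule, date(2024, 1, 15), "v1", "v2"))  # changed data
```

With the setting switched off (`only_if_changed=False`), the same call runs on every scheduled occasion within the date range, regardless of changes in the data.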
For a prediction to run on a schedule, the owner of that prediction configuration needs to have one of the following roles in both the space where the ML deployment exists and the space where the prediction datasets are being created:
-
Owner
-
Can edit (shared spaces only)
-
Can manage
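The role requirement above applies to two spaces at once, and Can edit only counts in shared spaces. A minimal sketch of that rule follows. The function names and the representation of a space as a `(role, space_type)` pair are hypothetical and do not correspond to any Qlik API.

```python
from typing import Tuple

# A user's standing in one space: (role, space type), e.g. ("Can edit", "shared").
SpaceAccess = Tuple[str, str]


def role_allows_predictions(role: str, space_type: str) -> bool:
    """Return True if the role is sufficient in a space of the given type."""
    if role in ("Owner", "Can manage"):
        return True
    if role == "Can edit":
        return space_type == "shared"  # Can edit qualifies in shared spaces only
    return False


def can_run_predictions(deployment_space: SpaceAccess, dataset_space: SpaceAccess) -> bool:
    """The role must be sufficient in both the ML deployment's space and the
    space where the prediction datasets are created."""
    return role_allows_predictions(*deployment_space) and role_allows_predictions(
        *dataset_space
    )


print(can_run_predictions(("Can edit", "shared"), ("Can manage", "managed")))
print(can_run_predictions(("Can edit", "managed"), ("Owner", "shared")))
```

The first call satisfies the rule in both spaces. The second does not, because Can edit is being relied on in a managed space.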
Deleting prediction configurations
You can delete existing prediction configurations from the Dataset predictions pane.
Do the following:
-
In the Dataset predictions pane, click ... on the prediction configuration you want to delete.
-
Select Delete prediction configuration from the Actions menu.
-
Click Delete to confirm.
Managing prediction jobs
Tenant admins can stop or cancel prediction jobs from the Management Console. For more information, see Managing experiments and ML deployments.
Configuring notifications
You can receive notifications when predictions are created from an ML deployment. For more information, see Configuring notifications for Qlik AutoML.