Creating predictions on datasets

Use your ML deployment to predict future outcomes on new data.

To start creating prediction configurations, open an ML deployment and go to the Dataset predictions pane. See Navigating the ML deployment interface.

You can make predictions on datasets in the catalog, for example, daily predictions on new transactions. Predictions can also be made in real time using the real-time prediction endpoint in the Machine Learning API. For information about real-time predictions, see Creating real-time predictions.

The real-time predictions API is deprecated and replaced by the real-time prediction endpoint in the Machine Learning API. The functionality itself is not being deprecated. For future real-time predictions, use the real-time prediction endpoint in the Machine Learning API.

Predictions are written to a prediction dataset. For classification models, this dataset also includes a column with the probability of each class. Optionally, you can also generate datasets with SHAP values or errors, and a copy of the apply dataset. The datasets can be in Parquet, CSV, or QVD format.

When predictions are generated, you can load the predictive insights into a Qlik Sense app. This lets you visualize and interact with the data and create what-if scenarios.

Before you start

Before you can start generating predictions with your ML deployment, the source model needs to be activated. For more information, see Approving deployed models.

Key concepts

Apply dataset

After training an experiment, you deploy a model that is used to generate predictions on a new dataset. This dataset is known as the apply dataset.

Any flat file that can be uploaded and profiled in Qlik Cloud is supported for use in Qlik AutoML.

For multi-table files such as Microsoft Excel files with multiple sheets, only the first table will be imported. If data profiling fails for a table (for example, if it is empty), the file is not supported.

The apply dataset must have the same features and data types as the dataset used to train the ML deployment. The target column specified in the ML experiment does not need to be included in the apply dataset. The apply dataset can also contain additional columns that were not part of model training. AutoML ignores these columns when generating predictions.
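The schema rule above can be checked before uploading an apply dataset. The following is a minimal Python sketch, not part of Qlik AutoML: the function name `check_apply_schema`, the type labels, and the sample columns are all hypothetical, and it assumes you already have the feature names and types for both the model and the apply dataset.

```python
from typing import Dict, List, Optional


def check_apply_schema(
    model_schema: Dict[str, str],
    apply_columns: Dict[str, str],
    target: Optional[str] = None,
) -> Dict[str, List[str]]:
    """Compare an apply dataset's columns against the model schema."""
    # The target column does not need to be present in the apply dataset.
    required = {name: dtype for name, dtype in model_schema.items() if name != target}

    # Features the model expects but the apply dataset lacks.
    missing = sorted(name for name in required if name not in apply_columns)
    # Features present in both, but with a different data type.
    mismatched = sorted(
        name
        for name, dtype in required.items()
        if name in apply_columns and apply_columns[name] != dtype
    )
    # Extra columns are allowed; they are listed only for information.
    ignored = sorted(name for name in apply_columns if name not in required)
    return {"missing": missing, "mismatched": mismatched, "ignored": ignored}


# Hypothetical example: a churn model trained with target column "churned".
model_schema = {"age": "numeric", "plan": "categorical", "churned": "categorical"}
apply_columns = {"age": "numeric", "plan": "categorical", "customer_id": "categorical"}
report = check_apply_schema(model_schema, apply_columns, target="churned")
print(report)
```

An apply dataset is usable under this rule when both `missing` and `mismatched` are empty. Columns listed under `ignored` do not cause a problem.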

Automatic feature engineering

For information about generating predictions with models that were trained using automatic feature engineering, see Automatic feature engineering.

Prediction configuration

Prediction datasets are generated from a prediction configuration. Each ML deployment can have multiple prediction configurations. The prediction configuration can be set to run with or without a schedule.

Prediction configuration ownership

When a user creates a prediction configuration, they are automatically assigned as the owner.

The following list notes the access requirements for a prediction configuration to run. If the prediction is run manually, the user running the predictions must meet the requirements. For scheduled predictions, the owner of the prediction configuration must meet the requirements.

- Professional or Full User entitlement and the Automl Deployment Contributor role in the tenant. See Who can work with Qlik AutoML.
- The required permissions in the space to run predictions from the ML deployment.
- The required permissions to create data sources in the space where the prediction data is being saved.

If the owner of a prediction configuration loses access to the tenant, or no longer meets the other requirements for working with ML deployments, scheduled predictions cannot run. In this case, a user with the required permissions can click Make me the owner to take ownership of the scheduled prediction so that it can run. This is done in the prediction configuration pane, or as an action in the Dataset predictions pane.
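The ownership rule can be summarized as: a manual run is checked against the user who starts it, and a scheduled run is checked against the owner. The following Python sketch illustrates only that decision. The `User` class, its field names, and the functions are hypothetical and do not correspond to any Qlik API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class User:
    name: str
    has_entitlement_and_role: bool         # entitlement plus Automl Deployment Contributor role
    can_run_in_deployment_space: bool      # space permission to run predictions
    can_create_data_in_target_space: bool  # space permission to create data sources

    def meets_requirements(self) -> bool:
        return (
            self.has_entitlement_and_role
            and self.can_run_in_deployment_space
            and self.can_create_data_in_target_space
        )


def user_to_check(owner: User, run_by: Optional[User] = None) -> User:
    # Manual run: the user who starts it. Scheduled run (run_by is None): the owner.
    return run_by if run_by is not None else owner


def can_run(owner: User, run_by: Optional[User] = None) -> bool:
    return user_to_check(owner, run_by).meets_requirements()


# Hypothetical example: the owner has lost the required role.
owner = User("owner", False, True, True)
editor = User("editor", True, True, True)

scheduled_ok = can_run(owner)              # False: the schedule depends on the owner
manual_ok = can_run(owner, run_by=editor)  # True: the running user is checked instead
```

In this example the schedule stays blocked until a user who meets the requirements takes ownership of the prediction configuration.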

For information about the space permissions requirements for any of the actions mentioned in this section, see Managing permissions in shared spaces and Managing permissions in managed spaces.

Considerations for apply datasets

Impact of manually changing feature type

When you manually change the feature type of a feature, and then deploy a resulting model, the feature type overrides will be applied to the feature in the apply dataset that is used in predictions made with that model.

For more information, see Changing feature types.

Requirements and permissions

To learn about the permission requirements for working with ML deployments and predictions, see Working with ML predictions.

Creating new predictions

You can create new prediction configurations from both the Deployment overview pane and the Dataset predictions pane.

Do the following:

Open an ML deployment from the catalog.
In the bottom right, click Create prediction.
In the Prediction configuration pane, expand Apply data and click Select apply dataset.
Select a dataset to generate predictions for. The dataset must have the same features and data types as the Model schema.

Datasets can be uploaded via the Create page of the Analytics activity center. You can also upload a new dataset directly into Catalog from the prediction's dataset selection page. This dataset will then automatically be selected as the apply dataset for use in your prediction. To do so, click Add apply dataset and choose the file to upload.
Under Prediction dataset, click Name prediction dataset.
Enter a name (or accept the default name).

Qlik AutoML supports dynamic file naming for prediction datasets. For more information, see Using variables in prediction dataset file names.
Select a format for the generated datasets. The default is Parquet. Datasets can also be generated in CSV or QVD format.
Select a space.
Click Confirm.
Under Prediction options, select any additional datasets that you want to generate.
- Errors dataset: Generate a dataset with errors for records in the apply dataset. This lets you know if a record was dropped and for what reason.
- SHAP: Generate a dataset with SHAP values for each record. The dataset has the columns index and <feature>_SHAP for each feature in the model.
  
  Note: This option is not available for predictions from multiclass classification models. For these models, you can use the Coordinate SHAP option instead.
- Coordinate SHAP: Generate a dataset with SHAP values for each record. This gives you the same values as the SHAP dataset but organized in a different way. The dataset has the columns index, automl_feature, and SHAP_value. An additional column, Predicted_class, is included with predictions from a multiclass classification model.
Choose whether to autogenerate an index column or use an existing column in the apply dataset.
You might also like to run your prediction on a schedule. Under Prediction schedule, click Create schedule and adjust the settings in the dialog that appears. For more information, see Scheduling predictions.
Click the Save and close button to save your prediction configuration and return to the Dataset predictions pane without running the prediction. You might prefer this option if you only want the predictions to run on a schedule.

Alternatively, click Save and predict now to save the prediction configuration and manually run the prediction.

When Last status shows "Success", the predictions are finished.
Go to Catalog to see the generated datasets.
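The SHAP and Coordinate SHAP options in the steps above hold the same values in two layouts: one column per feature, or one row per record and feature. The following Python sketch shows how the two layouts relate. It is an illustration only: Qlik AutoML generates both datasets itself, the function name `to_coordinate_shap` and the sample values are hypothetical, and the sketch assumes that `automl_feature` holds the feature name without the `_SHAP` suffix. The Predicted_class column for multiclass models is left out.

```python
from typing import Dict, List

SHAP_SUFFIX = "_SHAP"


def to_coordinate_shap(rows: List[Dict[str, object]]) -> List[Dict[str, object]]:
    """Reshape wide SHAP rows into one row per (index, feature) pair."""
    long_rows = []
    for row in rows:
        for column, value in row.items():
            # Only the <feature>_SHAP columns carry SHAP values.
            if column.endswith(SHAP_SUFFIX):
                long_rows.append(
                    {
                        "index": row["index"],
                        "automl_feature": column[: -len(SHAP_SUFFIX)],
                        "SHAP_value": value,
                    }
                )
    return long_rows


# Hypothetical SHAP dataset with two records and two features.
wide = [
    {"index": 0, "age_SHAP": 0.12, "plan_SHAP": -0.30},
    {"index": 1, "age_SHAP": -0.05, "plan_SHAP": 0.41},
]
coordinate = to_coordinate_shap(wide)
for line in coordinate:
    print(line)
```

The long layout is often easier to filter and aggregate by feature in an app, while the wide layout keeps one row per record.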

Editing prediction configurations

You can edit existing prediction configurations from the Dataset predictions pane.

Do the following:

In the Dataset predictions pane, click ... on the prediction configuration you want to edit.
Select Edit prediction configuration from the Actions menu.
In the Prediction configuration pane, you can edit the following sections:
- Apply data: You can change the apply dataset.
- Prediction dataset: You can change the name and space of the prediction dataset.
- Prediction options: You can change your selections for the additional datasets that are generated.
- Prediction schedule: If you wish, you can set the schedule on which your prediction will be run. For more information, see Scheduling predictions.
Click the Save and close button to save your prediction configuration and return to the Dataset predictions pane without running the prediction.

Alternatively, click Save and predict now to save the prediction configuration and manually run the prediction.

When Last status shows "Success", the predictions are finished.

Running predictions

You can run predictions for existing prediction configurations from the Dataset predictions pane. Alternatively, you might want to run your predictions according to a customizable schedule. You can combine manual and scheduled runs of your predictions to best suit your needs.

Running predictions manually

You can start running a prediction configuration directly by selecting the option within a context menu in the Dataset predictions pane.

For a user to run a prediction manually, that user must meet the access requirements for the action. See Prediction configuration ownership.

Do the following:

In the Dataset predictions pane, click ... on the prediction configuration you want to run predictions for.
Select Run predictions now from the Actions menu to start generating predictions.

When Last status shows "Success", the predictions are finished.

Scheduling predictions

Predictions can be set to run automatically on a schedule. You can create one schedule for each prediction configuration that you create. Access the Prediction schedule menu when creating or editing a prediction configuration.

For a scheduled prediction to run successfully, the owner of the prediction configuration must meet several permission requirements. Otherwise, the prediction cannot run. For more information, see Prediction configuration ownership.

The Prediction schedule dialog allows you to specify the following parameters for your schedule:

Run predictions: Adjust the general schedule on which the prediction will run (daily, weekly, or monthly). Set the interval, day of the week, or day of the month depending on your selection.
Time: Configure the time of day at which your prediction will start running.

If you are scheduling by the hour (for daily or weekly predictions), you will also be able to specify a start and end time between which the predictions will run.
Start date: Set the date on which the prediction schedule takes effect.
End date: Set the date on which the predictions will stop being run on the schedule. By default, the schedule will be set to continue running indefinitely, but you can specify an end date for the schedule.
Only run if apply dataset has changed: If there has been no change in your apply dataset since the last prediction was run, a scheduled prediction will not run. You can toggle this setting off if you want to always run the scheduled prediction regardless of changes in the data.
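The schedule parameters above combine into a simple decision about whether a scheduled run goes ahead. The following Python sketch is a rough illustration of that logic, not how Qlik AutoML implements it. The function `should_run_scheduled` and its parameters are hypothetical, and treating the end date as inclusive is an assumption.

```python
from datetime import date
from typing import Optional


def should_run_scheduled(
    today: date,
    start_date: date,
    end_date: Optional[date],
    only_if_changed: bool,
    dataset_changed: bool,
) -> bool:
    """Decide whether a scheduled prediction should run on a given day."""
    # The schedule has not taken effect yet.
    if today < start_date:
        return False
    # No end date means the schedule runs indefinitely.
    # Assumption: the end date itself still counts as a scheduled day.
    if end_date is not None and today > end_date:
        return False
    # Skip the run when the apply dataset is unchanged and the setting is on.
    if only_if_changed and not dataset_changed:
        return False
    return True


# Hypothetical example: open-ended schedule, apply dataset unchanged.
skipped = should_run_scheduled(
    today=date(2024, 5, 1),
    start_date=date(2024, 1, 1),
    end_date=None,
    only_if_changed=True,
    dataset_changed=False,
)
print(skipped)
```

With `only_if_changed` set to `False`, the same call returns `True`, which corresponds to toggling the setting off so that the prediction always runs.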

Deleting prediction configurations

You can delete existing prediction configurations from the Dataset predictions pane.

Do the following:

In the Dataset predictions pane, click ... on the prediction configuration you want to delete.
Select Delete prediction configuration from the Actions menu.
Click Delete to confirm.

Managing prediction jobs

Tenant admins can stop or cancel prediction jobs from the Administration activity center. For more information, see Administering Qlik AutoML.

Configuring notifications

You can receive notifications when predictions are created from an ML deployment. For more information, see Configuring notifications for Qlik AutoML.

Viewing data drift and prediction event details

After you run a prediction, switch to the Data drift monitoring pane to view details about the following:

The level of data drift for each feature in the apply dataset. The comparison is performed between your apply dataset and the training dataset.
Details about the prediction event, such as whether it succeeded or failed, and how many predictions it generated.

For more information, see Monitoring performance and usage of deployed models.
