Databricks MLflow analytics source
Databricks MLflow is a machine learning platform for managing the machine learning lifecycle, helping data scientists and analysts build, deploy, and serve accurate predictive models.
To connect to Databricks MLflow, you must have created, or have access to, a model that has been deployed to an endpoint on the Databricks MLflow platform. This endpoint must also be publicly accessible by Qlik Cloud.
Limitations
- Databricks MLflow has an endpoint quota. For more information, see Introduction to Databricks Machine Learning.
- The resources available on the Databricks MLflow service where the model is deployed limit performance, affecting both Qlik Sense reload times and chart responsiveness.
- The Databricks MLflow connector is limited to 200,000 rows per request. These are sent to the endpoint service in batches of 2,000 rows. If more rows must be processed, use a loop in the data load script to process them in batches, as shown in the first load script sketch after this list.
- When an app is reloaded regularly, it is best practice to cache the predictions in a QVD file and only send the new rows to the prediction endpoint. This improves the reload performance of the Qlik Sense app and reduces the load on the Databricks MLflow endpoint; see the second sketch after this list for an incremental caching pattern.
- When using Databricks MLflow in a chart expression, you must provide the data types of the fields, because the model needs to receive each field in the correct string or numeric format. Unlike in the load script, server-side extensions in chart expressions do not detect data types automatically; see the chart expression sketch after this list.
- If you are using a relative connection name and you move your app from one shared space to another, or from a shared space to your private space, it takes some time for the analytic connection to be updated to reflect the new space location.
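
The following load script is a minimal sketch of the batching pattern mentioned above. The table name SourceData, the connection name Databricks_MLflow, and the use of all fields are assumptions for illustration; replace them with the names used in your app, and check the exact EXTENSION syntax against your connection configuration.

    // Sketch only: score a large resident table in batches of 200,000 rows,
    // so each request stays within the connector's per-request limit.
    // 'SourceData' and the connection name 'Databricks_MLflow' are assumptions.
    Let vBatchSize = 200000;
    Let vTotalRows = NoOfRows('SourceData');

    For vStart = 0 to $(vTotalRows) - 1 step $(vBatchSize)

        // Pick out the next batch of rows from the source table.
        [Batch]:
        NoConcatenate
        LOAD *
        Resident SourceData
        Where RecNo() > $(vStart) and RecNo() <= $(vStart) + $(vBatchSize);

        // Send the batch to the endpoint. Each iteration returns the same
        // field set, so the results auto-concatenate into one predictions table.
        [Predictions]:
        LOAD *
        EXTENSION endpoints.ScriptEval(
            '{"RequestType":"endpoint", "endpoint":{"connectionname":"Databricks_MLflow"}}',
            Batch{*});

        Drop Table Batch;

    Next vStart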
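The next sketch shows one way to cache predictions in a QVD file and score only new rows on each reload. The file locations, the key field ID, and the assumption that the endpoint returns the key field alongside the predictions are illustrative and must be adapted to your data model and connection settings.

    // Sketch only: cache predictions in a QVD and send only new rows to the endpoint.
    // The file paths, the key field 'ID', and the connection name 'Databricks_MLflow'
    // are assumptions; the connection is assumed to return the key field with each
    // prediction. On the very first reload, create the QVD by scoring all rows.

    // 1) Load previously scored rows from the cache.
    [Predictions]:
    LOAD * FROM [lib://DataFiles/Predictions.qvd] (qvd);

    // 2) Load only the source rows whose key is not already in the cache.
    [NewRows]:
    NoConcatenate
    LOAD * FROM [lib://DataFiles/SourceData.qvd] (qvd)
    Where not Exists(ID);

    // 3) Score only the new rows; the result auto-concatenates onto [Predictions]
    //    when the returned fields match the cached table.
    [Predictions]:
    LOAD *
    EXTENSION endpoints.ScriptEval(
        '{"RequestType":"endpoint", "endpoint":{"connectionname":"Databricks_MLflow"}}',
        NewRows{*});

    Drop Table NewRows;

    // 4) Persist the updated cache for the next reload.
    STORE Predictions INTO [lib://DataFiles/Predictions.qvd] (qvd);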
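Finally, a sketch of a chart expression that passes the field data types explicitly by using a server-side extension function that accepts a data type string. The connection name, the field names, and the 'NNS' string are assumptions; each character must match the type (N for numeric, S for string) and order of the fields expected by the model.

    // Sketch of a chart measure. 'NNS' marks Age and Income as numeric
    // and Region as a string. 'Databricks_MLflow' and the field names
    // are assumptions.
    endpoints.ScriptEvalEx('NNS',
        '{"RequestType":"endpoint", "endpoint":{"connectionname":"Databricks_MLflow"}}',
        Age, Income, Region)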