Reviewing and refining models
After the first version of the model training is finished, analyze the resulting model metrics and configure new versions of the experiment until you have achieved the results you need.
When you run the experiment version, you are taken to the Models tab, where you can start analyzing the resulting model metrics. You can access Schema view and Data view by returning to the Data tab. More granular analysis can be performed in the Compare and Analyze tabs.
You will know the first version of the training is finished when all metrics populate in the Model metrics table, and a trophy icon appears next to the top model.
Analyzing the models from v1
Switch back to the Models tab. In the Model metrics table, the top model is marked with a trophy icon. This means that it is the top-performing model based on the F1 score.
Sort the models by performance, from highest to lowest, by clicking the F1 column header. You might choose to exclude low-performing algorithms or focus only on the best one to get faster results in the next iteration of the training. We will address this when configuring v3 in a later section.
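For reference, the F1 score that drives this ranking is the harmonic mean of precision and recall. The following is a minimal scikit-learn sketch of how such a score is computed; the labels are illustrative and are not taken from the tutorial dataset:

```python
from sklearn.metrics import f1_score, precision_score, recall_score

# Illustrative churn labels: 1 = churned ("yes"), 0 = did not churn ("no")
y_true = [1, 0, 1, 1, 0, 0, 1, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0, 1, 0]

precision = precision_score(y_true, y_pred)  # of predicted churners, how many actually churned
recall = recall_score(y_true, y_pred)        # of actual churners, how many were caught
f1 = f1_score(y_true, y_pred)                # harmonic mean of precision and recall

print(f"precision={precision:.2f} recall={recall:.2f} F1={f1:.2f}")
```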
Identifying data leakage
Look at the Model insights charts on the right side of the page. These charts give you an indication of the relative importance of each feature, as well as model performance.
From the Permutation importance chart, as well as the Features list in the Experiment configuration pane, notice that this first iteration of the model is relying heavily on the DaysSinceLastService feature, with all other features having almost no significance compared to it.
This disparity, and the models' extremely high F1 performance scores, should be viewed as a sign that something is wrong. In this case, the data collection logic never stopped counting the days since a customer's last service ticket after that customer canceled their subscription. As a result, the model learned to associate a large number of days since the last service ticket (present for customers who canceled years ago) with a value of yes in the Churned field.
This is an example of data leakage: in a real-world scenario, the model would only have access to information up until the point the prediction is made, but the number of days in this field was collected past that point of measurement. This specific issue is known as target leakage, a form of data leakage. For more information, see Data leakage.
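To make the mechanism concrete, here is a minimal, hypothetical sketch of how a feature like DaysSinceLastService can leak the target when its counter keeps running after a customer churns. The data and models below are fabricated for illustration; they are not the tutorial dataset or Qlik's training pipeline:

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 2000
churned = rng.integers(0, 2, n)

# Hypothetical leaky feature: for churned customers the counter keeps running
# long after they left, so large values almost perfectly encode the target.
days_since_last_service = np.where(
    churned == 1,
    rng.integers(400, 2000, n),  # counted well past the cancellation date
    rng.integers(0, 120, n),     # active customers: recent service tickets
)
legit_feature = rng.normal(size=n)  # a feature with no real signal

X = pd.DataFrame({
    "DaysSinceLastService": days_since_last_service,
    "LegitFeature": legit_feature,
})
X_train, X_test, y_train, y_test = train_test_split(X, churned, random_state=0)

model = RandomForestClassifier(random_state=0).fit(X_train, y_train)
print("F1 with leaky feature:", f1_score(y_test, model.predict(X_test)))  # unrealistically high

model_clean = RandomForestClassifier(random_state=0).fit(
    X_train.drop(columns="DaysSinceLastService"), y_train
)
print("F1 without it:", f1_score(
    y_test, model_clean.predict(X_test.drop(columns="DaysSinceLastService"))
))  # near chance, because the remaining feature carries no signal
```

Dropping the leaky column before training, as in the last step of the sketch, is the code-level equivalent of clearing the feature's checkbox in the experiment configuration.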
We need to remove the "leaky" DaysSinceLastService feature from the experiment configuration, since it is skewing the resulting models. Note that in a real-life use case, the data quality and collection logic need to be investigated thoroughly before model creation to ensure that the resulting model is trained properly.
We will address this issue when configuring v2.
Configuring and running version 2
Let's configure a new version to address the data leakage.
Do the following:
- Click View configuration to expand the experiment configuration panel.
- Click New version.
- In the panel, under Features, clear the DaysSinceLastService checkbox.
- Click Run v2.
Analyzing the models from v2
After the second version of the experiment has finished running, select the checkbox next to the top-performing v2 model (marked with a trophy icon) in the Model metrics table. This refreshes the page with the metrics for that model.
Comparing training and holdout metrics
You can view additional metrics and compare the metrics from the cross-validation training to the holdout metrics.
Do the following:
- In the experiment, switch to the Compare tab.
  An embedded analysis opens. You can use the interactive interface to dive deeper into your comparative model analysis and uncover new insights.
- In the Sheets panel on the right side of the analysis, switch to the Details sheet.
- Look at the Model Metrics table. It shows model scoring metrics, such as F1, as well as other information.
- Version 1 of the training was affected by target leakage, so let's focus only on v2. Use the Version filter pane on the right side of the sheet to select the value 2.
- In the Columns to show section, use the filter pane to add and remove columns in the table.
- In the drop-down list, add some training metrics to the table. Training scores for each metric are shown as values ending in Train.
You can now see the F1 metrics from the cross-validation training and compare them to the holdout metrics.
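Conceptually, this comparison amounts to scoring a model on its cross-validation folds and then on a holdout set it never saw during training; a large gap between the two usually points to overfitting or a data issue. A minimal scikit-learn sketch of the idea, using illustrative data rather than the tutorial's automated pipeline:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import cross_val_score, train_test_split

# Illustrative data standing in for the churn training set
X, y = make_classification(n_samples=1000, n_features=8, random_state=0)
X_train, X_holdout, y_train, y_holdout = train_test_split(
    X, y, test_size=0.2, random_state=0
)

model = GradientBoostingClassifier(random_state=0)

# F1 from cross-validation on the training portion
cv_f1 = cross_val_score(model, X_train, y_train, cv=5, scoring="f1")
print("cross-validation F1:", round(cv_f1.mean(), 3))

# F1 on the holdout portion, which the model never saw during training
model.fit(X_train, y_train)
print("holdout F1:", round(f1_score(y_holdout, model.predict(X_holdout)), 3))
```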
Identifying features with low importance
Next, we should check to see if there are any features with low permutation importance. Features that have little to no influence on the model should be removed for improved prediction accuracy.
Do the following:
- In the experiment, switch back to the Models tab.
- Look at the Permutation importance chart. The bottom four features—StartMonth, DeviceType, CustomerTenure, and Territory—provide much less influence on our model than the other features. They are of little value for this use case and can be seen as statistical noise.
In v3, we can remove these features to see if it improves the model scores.
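Permutation importance measures how much a model's score drops when a single feature's values are randomly shuffled: features whose shuffling barely changes the score contribute little. A minimal scikit-learn sketch of the calculation, using illustrative data rather than the tutorial's dataset:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# Illustrative data: only a couple of features carry real signal
X, y = make_classification(
    n_samples=1000, n_features=6, n_informative=2, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = RandomForestClassifier(random_state=0).fit(X_train, y_train)

# Shuffle each feature in turn and measure the drop in F1 on held-out data
result = permutation_importance(
    model, X_test, y_test, scoring="f1", n_repeats=10, random_state=0
)
for i, importance in enumerate(result.importances_mean):
    print(f"feature_{i}: {importance:.3f}")  # values near zero = candidates to remove
```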
Identifying low-performing algorithms
We can also look at the Model metrics table to see if we can remove any algorithms from the v3 training. You can remove low-performing algorithms when refining models so that the training runs faster in subsequent iterations.
- In the experiment, switch back to the Models tab.
- In the Model metrics table, use the Version filter to show only the models from v2.
- Look at the F1 scores for each Algorithm. If certain algorithms are creating models that score significantly lower than others, we can remove them from the next version.
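The same screening can be done programmatically by training a few candidate algorithms on the same split and comparing their F1 scores; the clear laggards are the ones to leave out of the next iteration. A hypothetical scikit-learn sketch (the algorithm names mirror those in the experiment, but the data is synthetic):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

X, y = make_classification(n_samples=1000, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

candidates = {
    "Random Forest": RandomForestClassifier(random_state=0),
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Gaussian Naive Bayes": GaussianNB(),
}

# Rank the candidates; noticeably weaker ones can be dropped from the next version
for name, clf in candidates.items():
    clf.fit(X_train, y_train)
    print(f"{name}: F1 = {f1_score(y_test, clf.predict(X_test)):.3f}")
```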
Configuring and running version 3
Do the following:
- Click View configuration to expand the experiment configuration panel.
- Click New version.
- In the panel, under Features, clear the checkboxes for StartMonth, DeviceType, CustomerTenure, and Territory.
- Optionally, expand Algorithms and clear the checkboxes for Gaussian Naive Bayes and Logistic Regression.
- Click Run v3.
Analyzing the models from v3
After v3 has run, you can clear the Version filter from the Model metrics table. Select the top-performing model from v3.
Let's do some quick comparison of the models across all versions.
The first version of the training produced the highest scores, but those metrics were inflated and were unrealistic predictors of performance because of the data leakage issue. In v3, the F1 score of the top-performing model increased compared to that of the top-performing v2 model.
As explored earlier, you can switch to the Compare tab for deeper comparison of model scores.
Focusing on a specific model
At any point during model analysis, you can perform granular analysis of an individual model. Explore prediction accuracy, feature importance, and feature distribution with an interactive Qlik Sense experience.
Do the following:
- With the top-performing v3 model selected, click the Analyze tab.
  An embedded analysis opens.
- With the Model Overview sheet, you can analyze the prediction accuracy of the model. Analysis is enhanced by the power of selections. Click a feature or predicted value to make a selection. The data in the embedded analysis is filtered to reflect your selection. You can drill down into specific feature values and ranges to view how the feature influence and prediction accuracy change.
- Switching to the other sheets, you can view visualizations for prediction accuracy, feature distribution, and impact distribution (SHAP). This analytics content can help you to:
  - Uncover the key drivers influencing trends in the data.
  - Identify how specific features and cohorts are affecting predicted values and prediction accuracy.
  - Identify outliers in the data.
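The impact distribution view is based on SHAP values, which attribute each individual prediction to contributions from each feature. As a rough illustration of the concept (not a reproduction of how Qlik computes or displays these values), here is a minimal sketch using the open-source shap Python package with synthetic data:

```python
import shap
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

# Illustrative data and model, standing in for the trained churn model
X, y = make_classification(n_samples=500, n_features=6, random_state=0)
model = GradientBoostingClassifier(random_state=0).fit(X, y)

# SHAP values: per-row, per-feature contributions to each prediction
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)  # shape: (n_samples, n_features)

# Beeswarm-style summary plot: one view of how feature impact is distributed
shap.summary_plot(shap_values, X)
```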
Next steps
In a real-world scenario, it is important to repeat these refining steps as many times as needed before deploying your model, to ensure that you have the best possible model for your particular use case.
Continue to the next section of this tutorial, which covers deploying your model.