
Reviewing and refining model versions

After the first version of the model training is finished, analyze the resulting model metrics and configure new versions of the experiment until you have achieved the results you need.

When you run an experiment version, you are taken to model view, where you can analyze the resulting model metrics. You can switch to schema view or data view at any time. To return to model view, click the model view icon.

You will know the first version of the training is finished when all metrics populate in the Model metrics table, and a trophy icon appears next to the top model.

Information note: AutoML is continually improving its model training processes. Therefore, you might notice that the model metrics and other details shown in the images on this page are not identical to yours when you complete these exercises.

Analyzing the model

In model view, we can see that the top algorithm is marked with a trophy icon. This means that it is the top-performing model based on the F1 score.

Model view showing the model metrics for the top-performing v1 model.
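
To get an intuition for this kind of ranking outside of AutoML, the following is a minimal Python sketch using scikit-learn on synthetic data. It is an illustration only, not AutoML's internal implementation: it trains two example algorithms, scores them on a holdout set, and sorts them by F1 (with accuracy shown as a second metric).

```python
# Illustrative sketch only (scikit-learn, synthetic data) -- not how AutoML
# itself trains or ranks models.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=10, random_state=0)
X_train, X_hold, y_train, y_hold = train_test_split(
    X, y, test_size=0.2, random_state=0)

candidates = {
    "Logistic regression": LogisticRegression(max_iter=1000),
    "Random forest": RandomForestClassifier(random_state=0),
}

results = []
for name, model in candidates.items():
    model.fit(X_train, y_train)
    pred = model.predict(X_hold)
    results.append((name, f1_score(y_hold, pred), accuracy_score(y_hold, pred)))

# Highest F1 first; the leader plays the role of the "trophy" model here.
for name, f1, acc in sorted(results, key=lambda r: r[1], reverse=True):
    print(f"{name}: F1={f1:.3f}, accuracy={acc:.3f}")
```

The steps below continue in model view in AutoML.
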
  1. In the top right of the table, click the column picker button. Here, you can view all the available metrics for our problem, and add or remove metrics as needed. Select any metrics you want to show in the table, or leave the default metrics.

    Column picker in the Model metrics table, used to add or remove metrics.
  2. In the Model metrics table, click the Algorithm filter dropdown and select the algorithm corresponding to the top-performing model.

  3. Toggle on Show training data metrics.

    You can now see the metrics from the cross-validation training and compare them to the holdout metrics. For each holdout metric column, there is a corresponding 'train' column for the equivalent metric from the training data.

    Model metrics table with training data metrics displayed next to the holdout metrics.
  4. Click Clear filters and switch the Show training data metrics toggle back to off.

  5. Sort the models by performance, from highest to lowest, by clicking the F1 column header. You might choose to exclude low-performing algorithms or focus only on the best one to get faster results in the next iteration of the training. We will address this when configuring v3 in a later section.

  6. Scroll down below the metrics table to see visualizations for the selected model.

    Model metrics table and visualizations for the selected model.
  7. Click the Experiment configuration pane or View configuration to expand it.

  8. Click New version to create a draft of the next experiment version.

  9. From the Permutation importance chart, as well as the Features list in the Experiment configuration pane, notice that this first iteration of the model is relying heavily on the DaysSinceLastService feature, with all other features having almost no significance compared to it.

    Permutation importance chart and Features list in the Experiment configuration pane, showing the disproportionately large influence of the DaysSinceLastService feature.

    This disparity, together with the models' extremely high performance, should be viewed as a sign that something is wrong. In this case, no logic was defined during data collection to stop counting the days since a customer's last service ticket once that customer canceled their subscription. As a result, the model learned to associate a large number of days since the last service ticket with a value of yes in the Churned field.

    This is an example of data leakage: in a real-world scenario, the model would only have access to information available up until the point where the prediction is made, but the number of days in this field was collected past that point of measurement. For more information about data leakage, see Data leakage.

    We need to remove the "leaky" feature DaysSinceLastService from the experiment configuration, since it is skewing the resulting models. Note that in a real-life use case, the data quality and collection logic need to be investigated thoroughly before model creation, to ensure that the resulting model is trained properly. For one way to reproduce this kind of diagnosis outside of AutoML, see the sketch after this list.

    We will address this issue in the next section, when configuring v2.
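
To make the leakage pattern concrete, here is a small, self-contained Python sketch using scikit-learn and synthetic churn data with hypothetical feature names (it is not AutoML itself). The DaysSinceLastService column is deliberately generated from the target to mimic the collection problem described above; the near-perfect training and holdout scores and the lopsided permutation importance are the same warning signs discussed in the steps above. The sketches in the following sections build on this one.

```python
# Illustrative sketch only (scikit-learn, synthetic data) -- not AutoML's
# internal implementation. Feature names are hypothetical.
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 2000
num_tickets = rng.poisson(3, n)
monthly_fee = rng.normal(50, 10, n)
account_id = rng.integers(10_000, 99_999, n)      # pure noise feature

# Churn genuinely depends on ticket volume and fee level.
p_churn = 1 / (1 + np.exp(-(0.4 * (num_tickets - 3) + 0.05 * (monthly_fee - 50))))
churned = rng.binomial(1, p_churn)

# Leaky feature: the day counter keeps running after a customer cancels,
# so churned customers end up with very large values.
days_since_last_service = np.where(churned == 1,
                                   rng.integers(200, 800, n),
                                   rng.integers(1, 60, n))

X = pd.DataFrame({
    "DaysSinceLastService": days_since_last_service,
    "NumTickets": num_tickets,
    "MonthlyFee": monthly_fee,
    "AccountId": account_id,
})

X_train, X_hold, y_train, y_hold = train_test_split(
    X, churned, test_size=0.2, random_state=0)
model_v1 = RandomForestClassifier(random_state=0).fit(X_train, y_train)

# Suspiciously high scores on both training and holdout data are a warning sign.
print("v1 train F1  :", f1_score(y_train, model_v1.predict(X_train)))
print("v1 holdout F1:", f1_score(y_hold, model_v1.predict(X_hold)))

# Permutation importance: shuffle one column at a time and measure the F1 drop.
imp = permutation_importance(model_v1, X_hold, y_hold, scoring="f1",
                             n_repeats=10, random_state=0)
for name, mean in sorted(zip(X.columns, imp.importances_mean),
                         key=lambda t: t[1], reverse=True):
    print(f"{name}: {mean:.3f}")
```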

Configuring and running version 2

Since most of the model training will change after this data leakage issue is fixed, let's configure a new version before completing any further refinements.

  1. The Experiment configuration pane is already open from a previous step, with a draft of v2 ready to configure.

  2. Under Features in the Experiment configuration pane, clear the DaysSinceLastService checkbox.

  3. Click Run v2.
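
As a rough code analogue, and continuing the illustrative sketch from the previous section (run it in the same Python session), removing the leaky column before retraining could look like this. Expect a lower but far more realistic score than the leaky v1 model.

```python
# Continues the illustrative sketch from the previous section (same session).
X2_train = X_train.drop(columns=["DaysSinceLastService"])
X2_hold = X_hold.drop(columns=["DaysSinceLastService"])

model_v2 = RandomForestClassifier(random_state=0).fit(X2_train, y_train)
print("v2 holdout F1:", f1_score(y_hold, model_v2.predict(X2_hold)))
```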

Configuring and running version 3

After the second version of the experiment has finished running, click the checkbox next to the top-performing v2 model in the metrics table (marked with a trophy icon). This refreshes the page with the metrics for that model.

Above the Model metrics table, click the Version filter dropdown and select 2. This allows you to focus only on the v2 model metrics.

You will see that the list of important features has changed substantially since addressing the data leakage. The top-performing model might also use a different algorithm than the top-performing model for v1.

Model metrics table with the Version filter set to 2, showing the top-performing v2 models sorted by F1 score.
  1. Look at the Permutation importance chart. Some features might exert much less influence on the model than the others. They are of little value for this use case and can be treated as statistical noise. You can try removing some of them to see whether the model scores improve (see the sketch after this list for one way to identify such features in code).

    Permutation importance chart for the top-performing v2 model, after removing the leaky DaysSinceLastService feature.
  2. Click the Experiment configuration pane or View configuration to expand it.

  3. Click New version to create a draft of the next experiment version.

  4. In the Experiment configuration pane, under Features, clear the checkboxes for one or more features that are exerting little to no influence on the model.

  5. Look at the Model metrics table. You might choose to exclude some low-performing algorithms or focus only on the best ones to get faster results in the next iteration of the training.

  6. In the Experiment configuration pane, under Algorithms, optionally clear the checkboxes for a few of the low-performing algorithms.

  7. Click Run v3.
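
Continuing the same illustrative sketch, one possible code analogue of pruning near-zero-importance features before retraining is shown below. The 0.01 threshold is an arbitrary assumption for illustration.

```python
# Continues the illustrative sketch (same Python session as the earlier sections).
imp_v2 = permutation_importance(model_v2, X2_hold, y_hold, scoring="f1",
                                n_repeats=10, random_state=0)

# Keep only features whose mean importance clears an arbitrary threshold.
keep = [name for name, mean in zip(X2_train.columns, imp_v2.importances_mean)
        if mean > 0.01]

model_v3 = RandomForestClassifier(random_state=0).fit(X2_train[keep], y_train)
print("v3 features  :", keep)
print("v3 holdout F1:", f1_score(y_hold, model_v3.predict(X2_hold[keep])))
```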

Comparing experiment versions

In the Model metrics table, click Clear filters.

After v3 has run, click the checkbox next to the top-performing v3 model to view its metrics.

Click More model filters, and select the Top performers filter. You can see metrics for the top performers of each iteration of the experiment.

The first version of the training resulted in the highest scores, but because of the data leakage issue, those metrics were highly exaggerated and unrealistic predictors of performance. In v3, the F1 score of the top-performing model increased compared to that of the top-performing v2 model.

Model metrics table with the Top performers filter applied, showing the top-performing model from each version.
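
Still continuing the illustrative sketch, the same comparison can be tabulated in code as the best holdout F1 per version. Note that in the sketch, as in the tutorial, the v1 score is inflated by the leaky feature rather than being a genuine improvement.

```python
# Continues the illustrative sketch: best holdout F1 per "version".
summary = pd.DataFrame({
    "version": ["v1 (leaky feature included)", "v2", "v3"],
    "holdout_F1": [
        f1_score(y_hold, model_v1.predict(X_hold)),
        f1_score(y_hold, model_v2.predict(X2_hold)),
        f1_score(y_hold, model_v3.predict(X2_hold[keep])),
    ],
})
print(summary.to_string(index=False))
```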

In a real-world scenario, it is important to repeat these refining steps as many times as needed before deploying your model, to ensure that you have the best possible model for your particular use case.

Continue to the next section of this tutorial, which covers deploying your model.
