
Refining models

Once you have created some initial models, it is important to refine them to increase their effectiveness and potential accuracy. The model scores indicate different measures of this performance. While the goal of refining the models is to increase these scores, a higher score doesn't always indicate a better model.

You can refine your models by excluding or including features, changing the training data, and modifying other configuration parameters. In doing so, you can compare different versions to see what effect your changes have.

Interpreting the scores helps you decide how to refine the model. The values of the different metrics can give you insight into which actions to take to improve the outcome.

Requirements and permissions

To learn more about the user requirements for working with ML experiments, see Working with experiments.

Configuring a new version

After you have run an experiment version, you can refine your models if needed by creating a new version.

  1. In the Model metrics table, select the model you want to refine.

  2. In the bottom right, click View configuration to open the Experiment configuration pane.

  3. Click New version.

After you create a new version, you can make changes to its configuration, such as:

  • Excluding existing features

  • Including previously excluded features

  • Changing or refreshing the dataset

  • Selecting or deselecting algorithms

More information about these options is provided in the sections below.

When drafting a new version, click the filter icon under Features in the Experiment configuration pane. Filtering makes it easier to see which features have been introduced since you changed the training dataset, and which features are auto-engineered or non-engineered.

Improving the dataset

If your model doesn't score well, you might want to review the dataset to address any issues. Read more about how to improve the dataset in Getting your dataset ready for training.

Excluding features

More features do not necessarily make a better model. To refine the model, you want to exclude unreliable and irrelevant features such as:

  • Features with high correlation to each other. Of two correlated features, exclude the one with lower feature importance.

  • Features with very low feature importance. These features have little influence on what you're trying to predict.

  • Features with unusually high feature importance. This can be a sign of data leakage.

Try removing the feature from the training data, then run the training again and check whether the model improves. Compare the model scores to see how much difference the exclusion makes.

  1. Open an experiment from Catalog.

  2. Select the model you want to refine.

  3. In the bottom right, click View configuration to open the Experiment configuration pane.

  4. Click New version to configure a new experiment version.

  5. Under Features, clear the checkboxes for any feature that you don’t want to use in the training.

Tip: Alternatively, you can deselect features in the schema and data views. Click Schema view to switch to the schema view, or Data view to switch to the data view. Return to the model view by clicking Model view.

Adding features

If your model still isn’t scoring well, it could be because the features that have a relationship with the target are not yet captured in the dataset. You can re-process and re-purpose your dataset to optimize the data quality, and to add new features and information. When ready, the new dataset can be added to future experiment versions. See Changing and refreshing the dataset.

Read more about how to capture or engineer new features in Creating new feature columns.
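Engineering a new feature column can be as simple as deriving it from columns you already have, then uploading the enriched dataset to a new experiment version. A minimal sketch, assuming hypothetical transaction columns (the column names and derivation logic are illustrative, not part of the product):

```python
import csv
import io

# Hypothetical transaction records; in practice this is your training dataset.
rows = [
    {"customer_id": "C1", "order_total": 120.0, "items": 4},
    {"customer_id": "C2", "order_total": 35.5,  "items": 1},
    {"customer_id": "C3", "order_total": 80.0,  "items": 2},
]

# Derive a new feature column from existing ones.
for row in rows:
    row["avg_item_price"] = round(row["order_total"] / row["items"], 2)

# Write the enriched dataset out as CSV, ready to use as new training data.
buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=rows[0].keys())
writer.writeheader()
writer.writerows(rows)
print(buf.getvalue())
```

The point is that a derived column like `avg_item_price` may have a clearer relationship with the target than either raw column alone.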

Selecting algorithms

Based on the data type of your target column, suitable algorithms are automatically selected for training. You might want to exclude algorithms that perform poorly or train slowly, so that you don't waste training time on them.

For more information about how algorithms are chosen, see Algorithms.

  1. Open an experiment from Catalog.

  2. Select the model you want to refine.

  3. In the bottom right, click View configuration to open the Experiment configuration pane.

  4. Click New version to configure a new experiment version.

  5. Under Algorithms, clear the checkboxes for any algorithms that you don’t want to use in the training.

Changing and refreshing the dataset

If your training data has changed since the last experiment version, you can change or refresh the dataset for future versions of the experiment.

This might be helpful if you would like to compare model metrics and performance for different datasets within the same experiment. For example, this is helpful if:

  • A new set of data records is available, or updates to the original set of data records were made. For example, the latest month's transactions might have become available and appropriate for use in training, or a data collection issue might have been identified and addressed.

  • The original training dataset has been re-processed or re-purposed, perhaps with the intention of improving model training. For example, you might have improved the logic to define feature column values, or even added new feature columns.

Changing or refreshing the dataset does not alter existing models that have already been trained from previous experiment versions. Within an experiment version, the models are trained only on the training data defined within that specific version.

Requirements

When you change or refresh the dataset for a new experiment version, the new dataset must meet the following requirements:

  • The name and feature type of the target column need to be the same as the target in the original training dataset.

  • The number of distinct values in the target column must be within the same range as required for the given experiment type. For example, for a multiclass classification experiment, the target column in the new dataset must still have between three and ten unique values. For the specific ranges, see Determining the type of model created.

The other feature columns can be entirely new, have different names, and contain different data.
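The requirements above amount to a simple pre-flight check you could run on a new dataset before swapping it in. This sketch assumes hypothetical column names; the real distinct-value ranges per experiment type are listed in Determining the type of model created:

```python
# Assumed distinct-value ranges per experiment type (see the product docs for
# the authoritative ranges; multiclass 3-10 is taken from this page).
RANGES = {
    "binary_classification": (2, 2),
    "multiclass_classification": (3, 10),
}

def validate_new_dataset(new_rows, target_name, experiment_type):
    """Check that a replacement dataset keeps the target usable."""
    if target_name not in new_rows[0]:
        return False, f"target column {target_name!r} is missing"
    distinct = {row[target_name] for row in new_rows}
    low, high = RANGES[experiment_type]
    if not low <= len(distinct) <= high:
        return False, f"{len(distinct)} distinct target values, expected {low}-{high}"
    return True, "ok"

# Hypothetical churn dataset with a binary target.
ok, msg = validate_new_dataset(
    [{"churned": "yes"}, {"churned": "no"}], "churned", "binary_classification"
)
print(ok, msg)  # True ok
```

Note that this sketch checks only the target column; the other feature columns can change freely, as described above.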

Changing the dataset

  1. In the Model metrics table in an experiment, select a model.

  2. In the bottom right, click View configuration to open the Experiment configuration pane.

  3. Click New version to configure a new experiment version.

  4. Under Training data, click Change dataset.

  5. Select or upload the new dataset.

Refreshing the dataset

  1. In the Model metrics table in an experiment, select a model.

  2. In the bottom right, click View configuration to open the Experiment configuration pane.

  3. Click New version to configure a new experiment version.

  4. Under Training data, click Refresh dataset.

    You are notified if a dataset refresh is available. A dataset typically refreshes when the existing data file is overwritten by a new file with the same name.
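One way to picture the "refresh available" notification: the data file keeps its name but gains a newer modification time when it is overwritten. The sketch below illustrates that idea with the standard library; it is an analogy, not the platform's actual mechanism:

```python
import os
import tempfile

def refresh_available(path, last_trained_at):
    """The dataset counts as refreshed if the file changed after training."""
    return os.path.getmtime(path) > last_trained_at

# Create a stand-in data file.
with tempfile.NamedTemporaryFile("w", suffix=".csv", delete=False) as f:
    f.write("id,target\n1,yes\n")
    path = f.name

trained_at = os.path.getmtime(path)

# Simulate overwriting the file with a new file of the same name.
with open(path, "w") as f:
    f.write("id,target\n1,yes\n2,no\n")
os.utime(path, (trained_at + 60, trained_at + 60))  # force a newer timestamp

available = refresh_available(path, trained_at)
print(available)  # True -- a refresh would be offered
os.unlink(path)
```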

Comparing experiment versions

Once you have made your changes, run the training again and compare the new version with the old one to see the effect of your changes.

  1. Click Run v2 in the bottom right corner of the screen to train another experiment version.

    (The text on the button depends on the number of versions you have run.)

  2. In the Model metrics table, you can filter the models using the dropdown menus for algorithm, version, and other properties. The table can also be sorted by individual metric columns.
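The filter-then-sort comparison the Model metrics table offers can be sketched as plain data manipulation. The records below are hypothetical (algorithm names and F1 scores are made up for illustration):

```python
# Hypothetical model metric records across two experiment versions.
models = [
    {"model": "m1", "version": 1, "algorithm": "logistic_regression", "f1": 0.71},
    {"model": "m2", "version": 1, "algorithm": "random_forest",       "f1": 0.78},
    {"model": "m3", "version": 2, "algorithm": "logistic_regression", "f1": 0.74},
    {"model": "m4", "version": 2, "algorithm": "random_forest",       "f1": 0.82},
]

# Filter to one algorithm, then sort by a metric column, best first.
rf = [m for m in models if m["algorithm"] == "random_forest"]
best_first = sorted(rf, key=lambda m: m["f1"], reverse=True)

for m in best_first:
    print(f"v{m['version']}  {m['model']}  f1={m['f1']}")
```

Here the version 2 random forest scores highest, suggesting the changes made in that version helped, which is exactly the judgment the table view supports.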

Comparing model versions

Image: Model metrics table comparing model metrics across multiple experiment versions.

Deleting experiment versions

You can delete experiment versions that you don't want to keep. Note that all models in the experiment versions will also be deleted and can't be recovered.

  1. In the Model metrics table, select a model from the experiment version you want to delete.

  2. In the bottom right, click Delete <version number>.

  3. In the confirmation dialog, click Delete.

