
Scoring multiclass classification models

Multiclass classification predicts a single discrete outcome, as in binary classification, but with more than two possible classes. Multiclass classification models are scored by accuracy and by different averages of F1: macro, micro, and weighted.

During the training of a multiclass classification experiment, the following charts are auto-generated to provide quick analysis of the generated models:

  • Permutation importance: A chart in which features are displayed in order from highest influence (biggest impact on model performance) to lowest influence (smallest impact on model performance). For more information, see Permutation importance.

  • SHAP importance: A chart representing how much each feature influences the predicted outcome. For more information, see SHAP importance in experiment training.

Macro F1

Macro F1 is the unweighted average of the F1 values calculated for each class. All classes are treated equally, regardless of how many records each one has.
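As an illustration, the calculation can be sketched in plain Python. This is a minimal sketch, not the product's implementation: the function names `per_class_f1` and `macro_f1` and the sample labels are hypothetical, and a class with no true or predicted records is assumed to score 0.

```python
def per_class_f1(y_true, y_pred):
    """Return {class: F1}, scoring each class one-vs-rest."""
    pairs = list(zip(y_true, y_pred))
    scores = {}
    for c in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == c and p == c for t, p in pairs)  # true positives
        fp = sum(t != c and p == c for t, p in pairs)  # false positives
        fn = sum(t == c and p != c for t, p in pairs)  # false negatives
        denom = 2 * tp + fp + fn
        # Assumption: an undefined F1 (no records at all) counts as 0.
        scores[c] = 2 * tp / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 values."""
    scores = per_class_f1(y_true, y_pred)
    return sum(scores.values()) / len(scores)


# Hypothetical sample data: 4 records of "a", 2 of "b", 2 of "c".
y_true = ["a", "a", "a", "a", "b", "b", "c", "c"]
y_pred = ["a", "a", "a", "b", "b", "c", "c", "c"]

print(per_class_f1(y_true, y_pred))  # a: 6/7, b: 0.5, c: 0.8
print(round(macro_f1(y_true, y_pred), 4))  # 0.719
```

Because every class counts equally, the weak score on the small class "b" pulls the macro average down as much as a weak score on the large class "a" would.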

Micro F1

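Micro F1 is the F1 value calculated across the entire confusion matrix: the true positives, false negatives, and false positives are totaled over all classes, and F1 is computed from those totals. Because each record belongs to exactly one class, the total false positives equal the total false negatives, so the Micro F1 score is equal to both the global precision and the global recall.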
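A minimal sketch of the pooled calculation in plain Python follows. The names `micro_counts` and `micro_f1` and the sample labels are hypothetical, chosen only for this example, and each record is assumed to have exactly one true and one predicted class.

```python
def micro_counts(y_true, y_pred):
    """Total TP, FP and FN summed over every class (one-vs-rest)."""
    labels = set(y_true) | set(y_pred)
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == c and p == c for c in labels for t, p in pairs)
    fp = sum(t != c and p == c for c in labels for t, p in pairs)
    fn = sum(t == c and p != c for c in labels for t, p in pairs)
    return tp, fp, fn


def micro_f1(y_true, y_pred):
    """F1 computed from the pooled counts rather than per class."""
    tp, fp, fn = micro_counts(y_true, y_pred)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical sample data.
y_true = ["a", "a", "a", "a", "b", "b", "c", "c"]
y_pred = ["a", "a", "a", "b", "b", "c", "c", "c"]

tp, fp, fn = micro_counts(y_true, y_pred)
global_precision = tp / (tp + fp)
global_recall = tp / (tp + fn)

print(tp, fp, fn)                       # 6 2 2
print(micro_f1(y_true, y_pred))         # 0.75
print(global_precision, global_recall)  # 0.75 0.75
```

Each misclassified record adds one false positive (for the predicted class) and one false negative (for the actual class), which is why the three values coincide.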

Weighted F1

Weighted F1 calculates F1 for each class in the same way as the binary classification F1, treating that class as positive and all other classes as negative. The per-class values are then combined as a weighted average, with each class weighted by its number of records.
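The weighting can be sketched in plain Python as follows. This is a minimal sketch under stated assumptions: the function name `weighted_f1` and the sample labels are hypothetical, the weight of a class is taken to be its number of actual (true) records, and the input is assumed to be non-empty.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Per-class F1 averaged with weights equal to each class's record count."""
    pairs = list(zip(y_true, y_pred))
    support = Counter(y_true)  # assumption: weight = number of true records
    total = 0.0
    for c, n in support.items():
        tp = sum(t == c and p == c for t, p in pairs)
        fp = sum(t != c and p == c for t, p in pairs)
        fn = sum(t == c and p != c for t, p in pairs)
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        total += n * f1
    return total / len(y_true)


# Hypothetical sample data: 4 records of "a", 2 of "b", 2 of "c".
y_true = ["a", "a", "a", "a", "b", "b", "c", "c"]
y_pred = ["a", "a", "a", "b", "b", "c", "c", "c"]

# Per-class F1 is a: 6/7, b: 0.5, c: 0.8, so the result is
# (4 * 6/7 + 2 * 0.5 + 2 * 0.8) / 8.
print(round(weighted_f1(y_true, y_pred), 4))  # 0.7536
```

With these labels the larger class "a" dominates the result, so the weighted value sits above the unweighted mean of the same three per-class scores.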

Accuracy

Accuracy measures the proportion of predictions that are correct. It is calculated as the number of predictions that exactly match the actual class divided by the total number of samples.
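As a minimal sketch in plain Python (the function name `accuracy` and the sample labels are hypothetical, and the input is assumed to be non-empty):

```python
def accuracy(y_true, y_pred):
    """Share of samples whose predicted class equals the actual class."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Hypothetical sample data: 6 of the 8 predictions match.
y_true = ["a", "a", "a", "a", "b", "b", "c", "c"]
y_pred = ["a", "a", "a", "b", "b", "c", "c", "c"]

print(accuracy(y_true, y_pred))  # 0.75
```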
