
Cardinality

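Cardinality is the uniqueness of data values in a column, that is, how many distinct values the column contains. Columns at either extreme carry little usable information. If nearly every value is unique, no value repeats often enough to reveal a pattern. If every value is the same, the column has no variance at all. In both cases, a machine learning model can't identify a pattern in that data.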
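As a rough illustration, cardinality can be measured by counting the distinct values in each column and comparing that count to the number of rows. The sketch below uses plain Python; the `cardinality` helper and the `customers` sample data are hypothetical and exist only for this example.

```python
def cardinality(rows, column):
    """Return (unique_count, unique_ratio) for one column of a list-of-dicts table."""
    values = [row[column] for row in rows]
    unique_count = len(set(values))
    # Ratio of distinct values to rows: 1.0 means every value is unique.
    unique_ratio = unique_count / len(values) if values else 0.0
    return unique_count, unique_ratio


# Hypothetical sample data, invented for illustration.
customers = [
    {"customer_id": "C001", "country": "SE", "plan": "basic"},
    {"customer_id": "C002", "country": "SE", "plan": "basic"},
    {"customer_id": "C003", "country": "US", "plan": "basic"},
    {"customer_id": "C004", "country": "US", "plan": "basic"},
]

report = {col: cardinality(customers, col) for col in customers[0]}
for col, (count, ratio) in report.items():
    print(f"{col}: {count} unique values, ratio {ratio:.2f}")
```

In this sample, `customer_id` is unique on every row (ratio 1.00) and `plan` has a single value, so both sit at the extremes described above, while `country` falls in between.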

High cardinality means a high number of unique values. To reduce high cardinality, you can bin, or group, similar values. You can also create new feature columns. For example, home addresses could be turned into distances to or from a specific location.
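Binning can be sketched in a few lines of plain Python. The example below groups individual ages into four ranges; the bin edges, the labels, and the `bin_value` helper are assumptions chosen for illustration, not recommended settings.

```python
def bin_value(value, edges, labels):
    """Return the label of the first bin whose upper edge is above value.

    `labels` must have one more entry than `edges`; the last label
    catches every value at or above the final edge.
    """
    for upper, label in zip(edges, labels):
        if value < upper:
            return label
    return labels[-1]


# Hypothetical ages, edges, and labels, invented for illustration.
ages = [16, 17, 22, 23, 29, 41, 44, 58, 67, 70]
edges = [18, 35, 65]
labels = ["under 18", "18-34", "35-64", "65+"]

binned = [bin_value(age, edges, labels) for age in ages]

print(len(set(ages)), "unique values before binning")
print(len(set(binned)), "unique values after binning")
```

Here ten distinct ages collapse into four groups, so each group contains several rows that a model can compare.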

A column with only one unique value (constant) is not useful in identifying patterns.
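One way to spot constant columns before training is to check which columns contain exactly one distinct value. This is a minimal sketch in plain Python; the `constant_columns` helper and the `orders` sample data are hypothetical.

```python
def constant_columns(rows):
    """Return the names of columns that hold a single distinct value."""
    columns = list(rows[0].keys()) if rows else []
    return [col for col in columns if len({row[col] for row in rows}) == 1]


# Hypothetical sample data, invented for illustration.
orders = [
    {"order_id": 1, "region": "EMEA", "amount": 120},
    {"order_id": 2, "region": "EMEA", "amount": 75},
    {"order_id": 3, "region": "EMEA", "amount": 120},
]

to_drop = constant_columns(orders)

# Keep only the columns that vary from row to row.
cleaned = [{k: v for k, v in row.items() if k not in to_drop} for row in orders]

print("Constant columns:", to_drop)
```

In this sample only `region` is flagged, because every row has the same value for it.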
