Configuring a data-detected class
Before you begin
Procedure
- Go to .
- Click the data class to configure its properties.
Field: Description
Enable/Disable: Select Enable to include this data class in the next data classification operation. Select Disable to exclude it from the next data classification operation.
Note: Even when a data class is disabled, it can still be manually assigned to and unassigned from an imported object.
Name: Type in the name of the data class.
Description: Type in a description.
Classification groups: Select one or more classification groups for this data class. You can classify the data of a model by group.
Term: Select a glossary term to associate with any object classified with this data class. When you trace the semantic definition from an object associated with the data class, the term provides information such as its name and description. When you trace the semantic usage from the term, you can also obtain the list of all objects associated with this data class.
Default sensitivity: Select a sensitivity label to assign to any object classified with this data class. The sensitivity label assignment can control the display of data profiling and sampling information on the object pages. By default, you can see this information when you have an assigned object role with the data viewing capability. If a sensitivity label with the Hide Data option enabled is assigned, you cannot see the information as a data viewer.
Auto learning: Enable this option to allow the data class to be auto-populated with a pattern based on existing imported objects.
Matching threshold (%): Enter the minimum percentage of values, among all values of the field or column, that must match any of the enumeration values, patterns, or regular expression.
Uniqueness threshold: Enter the minimum number of unique values required among all values of the field or column, to ensure enough diversity in the dataset. By default, the value is set to 6 for patterns and regular expressions. For enumerations, the default value is 1, and the value is limited to the number of possible values in the enumeration list. If the number of possible values is less than the value specified in the Uniqueness threshold field, Talend Data Catalog uses the number of possible values as the uniqueness threshold instead.
Note: If you use an "International" enumeration data class that includes values in different languages, and a column uses values of this data class in only one language, Talend Data Catalog matches the column with a confidence below 100% because of the values in the other languages. It is recommended to use "International" data classes only for multilingual columns. Otherwise, define one data class for each language used and group them in an "International" compound data class.
Data pattern: Select the type:
  - Enumeration: list of valid values for the data of that data class.
  - Pattern: patterns for the data of that data class.
  - Regular Expression: expression syntax that the data of that data class should conform to.
Possible Values / Data patterns / Regular expression: Enter the list of possible values, the data patterns, or the regular expression, depending on the selected type.
- Save your changes.
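
To make the Matching threshold (%) and Uniqueness threshold fields more concrete, the following Python sketch shows one way such checks could be evaluated on a column of values. It is an illustration only and not Talend Data Catalog's implementation: the function names (match_percentage, column_matches) are hypothetical, the percentage is assumed to be computed over all values of the column, and regular expressions are assumed to be matched against the whole value. The Pattern type is left out because its syntax is not described here.

```python
import re
from typing import Callable, Optional, Sequence


def match_percentage(values: Sequence[str], is_match: Callable[[str], bool]) -> float:
    """Percentage of column values accepted by the data class (assumed definition)."""
    if not values:
        return 0.0
    return 100.0 * sum(1 for v in values if is_match(v)) / len(values)


def column_matches(
    values: Sequence[str],
    is_match: Callable[[str], bool],
    matching_threshold: float,
    uniqueness_threshold: int,
    enumeration_size: Optional[int] = None,
) -> bool:
    """Return True when both thresholds are met (hypothetical helper)."""
    effective_uniqueness = uniqueness_threshold
    if enumeration_size is not None:
        # For enumerations, the threshold is capped at the number of possible values.
        effective_uniqueness = min(uniqueness_threshold, enumeration_size)
    enough_matches = match_percentage(values, is_match) >= matching_threshold
    # Diversity is counted over all values of the column, matching or not.
    enough_diversity = len(set(values)) >= effective_uniqueness
    return enough_matches and enough_diversity


# Enumeration data class with three possible values.
colors = {"red", "green", "blue"}
color_column = ["red", "green", "red", "blue", "purple"]
color_pct = match_percentage(color_column, colors.__contains__)
color_ok = column_matches(
    color_column, colors.__contains__, 75, 6, enumeration_size=len(colors)
)

# Regular expression data class: five-digit codes, matched against the whole value.
code_re = re.compile(r"\d{5}")
code_column = ["12345", "54321", "abcde", "12345"]


def is_code(value: str) -> bool:
    return code_re.fullmatch(value) is not None


code_pct = match_percentage(code_column, is_code)
code_ok = column_matches(code_column, is_code, 75, 6)

print(color_pct, color_ok)
print(code_pct, code_ok)
```

In this sketch, the color column passes: 4 of its 5 values are in the enumeration, and the uniqueness threshold of 6 is capped at the 3 possible values. The code column reaches the 75% matching threshold but fails because it contains only 3 distinct values, fewer than the threshold of 6.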