Tagging columns automatically with data classes

Categorize the metadata in the anonymized_orders database table using data classes, data profiling and sampling.

Before you begin

  • You restored the following project backup and switched the configuration: tdc_gsg_sources_files\use-case-2-using_and_contributing\repo_backup.zip. For more information, see Restoring a project backup in the repository.
  • Data sampling and profiling have been enabled and configured during the metadata import. Data sampling and profiling are required to perform the auto-tagging for data classification.
  • You have been assigned an object role with the Metadata Viewing and Data Viewing capabilities.
  • You have been assigned an object role with the Data Classification Editing capability.

Procedure

  1. In the search bar, type anonymized_orders.
  2. Click anonymized_orders to access its object page and open the Columns tab.
  3. On the toolbar, click Columns and select Grid as the display mode.
  4. Drag and drop the Data Classifications column from Available columns to Selected columns to display it in the Grid view.
    Talend Data Catalog has already proposed tags. Tags outlined with dotted blue lines are not yet approved; tags outlined with solid blue lines are approved.
  5. Before approving the proposed data classes for email_address, click the email_address column name to open its object page and see statistics and other information about this data.
    The data profile information is displayed in the Overview tab.
  6. Go back to the anonymized_orders table page and click the Data Sample tab to preview the sample data. This tab displays sample rows extracted from the dataset.
  7. Go back to the Columns tab and click the tick icon to approve the Email data class for email_address.
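
To give an intuition for why sample data is needed before a data class such as Email can be proposed, here is a minimal sketch of pattern-based tagging over sampled values. It is not Talend Data Catalog's actual algorithm: the function name, the regular expression, and the 0.75 threshold are all assumptions made for illustration.

```python
import re

# Hypothetical pattern and threshold, chosen for illustration only;
# they are not taken from Talend Data Catalog.
EMAIL_PATTERN = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")
MATCH_THRESHOLD = 0.75


def propose_data_class(sample_values, pattern=EMAIL_PATTERN,
                       label="Email", threshold=MATCH_THRESHOLD):
    """Return `label` if enough non-empty sampled values match `pattern`, else None."""
    # Ignore missing and blank values so they do not dilute the match ratio.
    values = [v.strip() for v in sample_values if v and v.strip()]
    if not values:
        return None
    matches = sum(1 for v in values if pattern.fullmatch(v))
    return label if matches / len(values) >= threshold else None


# Made-up sample values standing in for a sampled email_address column.
sample = ["ann@example.com", "bob@example.org", "", "carol@example.net",
          "n/a", "dave@example.com"]
print(propose_data_class(sample))  # prints "Email" (4 of 5 non-empty values match)
```

In this sketch the proposed tag is only a suggestion, which mirrors the procedure above: a person still reviews the profile and the sample rows before approving the data class.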

Results

You are ready to implement a glossary to document the enterprise vocabulary.