Adding a new compound semantic type

You can create a compound semantic type that references other semantic types published on Talend Dictionary Service, and then add it to the list of recognized data types in the data models in Talend Data Stewardship.

You can mix any kinds of semantic types when creating a compound type. A compound semantic type can also reference other compound types, provided that all child types are already published.

Let's say that you have a file which holds information about customers from the US, the UK, Germany and France. You need to validate the different zip codes against a compound semantic type that you create. Once a value matches one of the child types, it is considered valid and is not evaluated against the other referenced types.

When defining the data model in Talend Data Stewardship, you can set the semantic type for the column containing the zip codes to this new compound type, Zip_codes in this example.
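
The matching behavior described above can be sketched in a few lines of Python. This is only an illustration, not how Talend Dictionary Service is implemented: the child type names and the regular expressions are simplified assumptions made up for this example, and the real published semantic types may use different rules.

```python
import re

# Hypothetical, simplified child types for illustration only.
# These names and patterns are assumptions, not Talend's definitions.
CHILD_TYPES = [
    ("US_zip", re.compile(r"\d{5}(-\d{4})?")),
    ("UK_postcode", re.compile(r"[A-Z]{1,2}\d[A-Z\d]? ?\d[A-Z]{2}")),
    ("DE_postcode", re.compile(r"\d{5}")),
    ("FR_postcode", re.compile(r"\d{5}")),
]


def first_matching_child(value, child_types=CHILD_TYPES):
    """Return the name of the first child type that matches, or None."""
    candidate = value.strip()
    for name, pattern in child_types:
        if pattern.fullmatch(candidate):
            # Stop at the first match: later child types are not evaluated.
            return name
    return None


def is_valid(value):
    """A value is valid for the compound type if any child type matches."""
    return first_matching_child(value) is not None
```

In this sketch, a Berlin code such as "10115" is reported under the first child type whose pattern it satisfies, because evaluation stops at the first match. The value is valid for the compound type either way.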

Before you begin

All the child semantic types you want to use in the compound type have been created and published.

Procedure

  1. Select SEMANTIC TYPES > ADD SEMANTIC TYPE.
  2. Enter a name and a description for the new semantic type.
  3. From the Type list, select the compound type.
  4. Keep the Use for validation switch activated.

    This compound type will be used to define which values are considered valid or invalid when it is applied to a given column. The result of this validation process can be seen in the quality bar of each column in your datasets.

    In this example, if you were to deactivate the switch, the compound type would only be used for data discovery, and no value would be considered invalid.

  5. From the Children types list, select the semantic types you want to group in this compound type.
  6. Click SAVE AND PUBLISH to send the semantic type to the Talend Dictionary Service server and make it available to be used by Data Stewardship.
    Clicking SAVE AS DRAFT stores the new type on the server without propagating it to the system. The new type cannot be used until it is published. For example, if you have new semantic types to deploy as part of a new project, you can create them and save them as drafts before the project goes live, and then publish them only on the day of go-live.
  7. From the DATA MODELS page, create the data model for the customers data.
    The new semantic type Zip_codes is now available in the list of semantic types, and you can set it for the column containing the zip codes.
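
The effect of the Use for validation switch from step 4 can be pictured with the following sketch. It is a simplified model under stated assumptions, not the product's implementation: the function names, the `use_for_validation` parameter and the two patterns are hypothetical stand-ins for a published compound type.

```python
import re

# Hypothetical stand-in for a compound type: a value matches if any of
# these simplified child patterns matches. The patterns are assumptions.
CHILD_PATTERNS = [
    re.compile(r"\d{5}(-\d{4})?"),
    re.compile(r"[A-Z]{1,2}\d[A-Z\d]? ?\d[A-Z]{2}"),
]


def matches_compound(value):
    """True if the value matches at least one child pattern."""
    return any(p.fullmatch(value) for p in CHILD_PATTERNS)


def invalid_values(values, use_for_validation=True):
    """Return the values that would be flagged as invalid for a column."""
    if not use_for_validation:
        # Discovery only: the type can still be detected on the column,
        # but no value is considered invalid.
        return []
    return [v for v in values if not matches_compound(v)]
```

With the switch on, `invalid_values(["10115", "SW1A 1AA", "n/a"])` flags only "n/a". With `use_for_validation=False`, the same call returns an empty list.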

Results

When you load the customer data into Talend Data Stewardship, it is matched and validated against the Zip_codes compound type you created. Each value is evaluated against the first child type. If it matches, it is considered valid and is not evaluated against the other referenced types. Otherwise, it is evaluated against the next child type, and so on.
