
Defining the set of columns to be analyzed

Before you begin

You have defined at least one database connection in the Profiling perspective of Talend Studio.

Procedure

  1. In the DQ repository tree view, expand Data Profiling, right-click Analyses, and select New analysis.
    The Create new analysis wizard opens.
  2. Select Table > Column set analysis and click Create.
  3. Enter a name.
  4. Optional: Set column analysis metadata (Purpose, Description) in the corresponding fields.
  5. Click Next.
  6. From the Connection list, select the connection and click Next.
  7. From the Columns menu, click Select columns, select the database and the columns you want to analyze, and click OK.
    In this example, you want to analyze a set of six columns in the customer table: account number (account_num), education (education), email (email), first name (fname), last name (lname) and gender (gender). The statistics presented in the analysis results are the row count, distinct count, unique count and duplicate count, all of which apply to records (the combined values of a set of columns) rather than to individual column values.
    A data preview is displayed.
    Overview of the Data Preview in the Analysis Settings tab.
  8. Click Next.
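
To illustrate what the four statistics mean when they apply to records rather than to single columns, here is a minimal Python sketch. It is not how Talend Studio computes the indicators. The function name column_set_statistics and the sample rows are hypothetical, and the definitions are assumptions: a unique record occurs exactly once, a duplicate record is a distinct record that occurs more than once, so unique count + duplicate count = distinct count.

```python
from collections import Counter


def column_set_statistics(rows, columns):
    """Compute record-level statistics over a set of columns.

    Assumed definitions (not taken from the Talend implementation):
      row_count       - total number of rows
      distinct_count  - number of different records
      unique_count    - records that occur exactly once
      duplicate_count - distinct records that occur more than once
    """
    # A record is the tuple of values taken from the selected columns.
    records = [tuple(row[col] for col in columns) for row in rows]
    occurrences = Counter(records)
    return {
        "row_count": len(records),
        "distinct_count": len(occurrences),
        "unique_count": sum(1 for n in occurrences.values() if n == 1),
        "duplicate_count": sum(1 for n in occurrences.values() if n > 1),
    }


# Hypothetical sample data using three of the columns from the example.
sample_rows = [
    {"fname": "Ann", "lname": "Lee", "gender": "F"},
    {"fname": "Ann", "lname": "Lee", "gender": "F"},
    {"fname": "Bob", "lname": "Lee", "gender": "M"},
    {"fname": "Ann", "lname": "Kim", "gender": "F"},
]

stats = column_set_statistics(sample_rows, ["fname", "lname", "gender"])
print(stats)
```

In this sample, the record (Ann, Lee, F) appears twice, so it counts once toward the distinct count and once toward the duplicate count, while the other two records are unique.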
