Removing non-matching values

The email pattern applied to the email column showed that some records do not conform to the standard email format. You can generate a ready-to-use Job to retrieve the non-matching rows from the column.

Procedure

  1. In the Profiling perspective, click the Analysis Results tab at the bottom of the editor.
  2. In the Pattern Matching results of the email column, right-click the chart bar or the numerical results and select Generate Job.

    The Integration perspective opens showing the generated Job.

    This Job uses the Extract Transform Load process to write the email rows to two separate output files: one for the valid rows that match the pattern and one for the invalid rows that do not.

  3. Save the Job and press F6 to execute it.

Results

The valid and invalid rows of the email column are written to the defined output files.

You can replace the output files with other Talend components to retrieve the valid and invalid email rows and write them to databases, for example.
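
To make the valid/invalid split concrete outside Talend Studio, the following is a minimal Python sketch of the same idea, not the code that the generated Job contains. The regular expression, the function names `split_by_pattern` and `write_rows`, and the sample rows are all assumptions made for illustration; the pattern used in the Studio analysis is not shown on this page and may differ.

```python
import csv
import io
import re

# Hypothetical email pattern for illustration only; the pattern used in the
# Studio analysis is not shown on this page and may differ.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def split_by_pattern(rows, column, pattern=EMAIL_PATTERN):
    """Return (matching_rows, non_matching_rows) for the given column."""
    valid, invalid = [], []
    for row in rows:
        value = row.get(column) or ""
        # fullmatch requires the whole value to match, not just a substring.
        (valid if pattern.fullmatch(value) else invalid).append(row)
    return valid, invalid


def write_rows(rows, fieldnames, stream):
    """Write rows as CSV to any text stream (a file, an in-memory buffer)."""
    writer = csv.DictWriter(stream, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)


# Made-up sample data standing in for the profiled table.
SAMPLE_ROWS = [
    {"id": "1", "email": "anna@example.com"},
    {"id": "2", "email": "bob.example.com"},
    {"id": "3", "email": "carol@example"},
    {"id": "4", "email": "dave@mail.example.org"},
]

if __name__ == "__main__":
    valid_rows, invalid_rows = split_by_pattern(SAMPLE_ROWS, "email")
    for name, rows in (("valid", valid_rows), ("invalid", invalid_rows)):
        buffer = io.StringIO()
        write_rows(rows, ["id", "email"], buffer)
        print(f"--- {name} ---")
        print(buffer.getvalue(), end="")
```

Because `write_rows` accepts any text stream, the in-memory buffers here could be swapped for opened files, which loosely mirrors replacing the output components of the Job with a different target.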

For more information on using the Profiling perspective to identify and remove corrupt, incomplete, or inaccurate data, see Data cleansing in the Talend Studio User Guide.
