Define the match analysis

Procedure

From the Profiling perspective, right-click Metadata and create a file connection to the duplicated_records output file generated by the Job.
For further information, check the Data Profiling part in the Talend Studio User Guide.
Expand the new file connection under Metadata and select Analyze matches.
Follow the steps in the wizard to define the analysis metadata and click Finish to open the analysis editor.
In the Matching Key table, define a match key on the Code column to group records by their identifier: records that have the same code are grouped together.
Click Chart below the table to show the duplicates generated according to the Bernoulli distribution selected previously in the Job.
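
To cross-check what the chart displays, the grouping can be approximated outside the Studio. The following is a minimal sketch, not Talend's implementation: it assumes an exact match on the Code column, and the sample rows, the Name column, and the function names are hypothetical stand-ins for the contents of the duplicated_records file.

```python
from collections import Counter, defaultdict


def group_by_code(records, key="Code"):
    """Group records whose key column holds exactly the same value."""
    groups = defaultdict(list)
    for record in records:
        groups[record[key]].append(record)
    return dict(groups)


def group_size_distribution(groups):
    """Count how many groups contain 1, 2, 3, ... records."""
    return Counter(len(members) for members in groups.values())


# Hypothetical sample rows; real data would come from the duplicated_records file.
records = [
    {"Code": "A1", "Name": "Alice"},
    {"Code": "B2", "Name": "Bob"},
    {"Code": "A1", "Name": "Alice"},
    {"Code": "C3", "Name": "Carol"},
    {"Code": "B2", "Name": "Bob"},
    {"Code": "A1", "Name": "Alice"},
]

groups = group_by_code(records)
distribution = group_size_distribution(groups)

# Print one line per group size, smallest first.
for size, count in sorted(distribution.items()):
    print(f"groups of size {size}: {count}")
```

Groups of size 1 correspond to unique records, and larger groups correspond to records duplicated by the Job, so the resulting counts should resemble the shape of the chart in the analysis editor.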
