Executing the Job to label suspect pairs with assigned labels
Procedure
Results
tMatchPredict labels the suspect pairs, groups the suspect records labeled YES, and writes all the suspect pairs to the output file.
The records labeled YES belong to groups because tMatchPredict was configured to group the records that match this clustering class.
The records labeled NO do not belong to any group.
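To make the difference between labeling and grouping concrete, the following Python snippet is a minimal conceptual sketch. It is not how tMatchPredict is implemented. It assumes that records are linked transitively through their YES pairs, and the function name, the record IDs and the tuple layout are hypothetical.

```python
def group_yes_pairs(pairs):
    """Map each record found in a YES pair to a group ID.

    pairs: iterable of (record_id_a, record_id_b, label) tuples (hypothetical layout).
    Records that only appear in NO pairs are absent from the result.
    """
    parent = {}

    def find(x):
        # Return the representative of x's group, compressing the path on the way.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        # Merge the two groups and keep the smaller ID as the representative.
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[max(ra, rb)] = min(ra, rb)

    for a, b, label in pairs:
        if label == "YES":
            union(a, b)

    return {rec: find(rec) for rec in list(parent)}


labeled_pairs = [(1, 2, "YES"), (2, 3, "YES"), (4, 5, "NO"), (6, 7, "YES")]
groups = group_yes_pairs(labeled_pairs)
```

In this sketch, records 1, 2 and 3 end up in one group, records 6 and 7 in another, and records 4 and 5 get no group because their pair is labeled NO.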
What to do next
You can now create a single representation of each group of duplicates and merge these representations with the unique rows computed by tMatchPairing.
For an example of how to create a clean and deduplicated dataset, see Creating a clean data set from the suspect pairs labeled by tMatchPredict and the unique rows computed by tMatchPairing on Talend Help Center (https://help.talend.com).
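As a rough illustration of this next step outside Talend Studio, the Python sketch below keeps one record per duplicates group and appends the unique rows. The survivorship rule (keep the record with the most non-empty fields), the function names and the field names are assumptions made for the example. They do not come from the Talend components.

```python
from collections import defaultdict


def most_complete(records):
    # Hypothetical survivorship rule: keep the record with the most non-empty fields.
    # On a tie, the first record encountered is kept.
    return max(records, key=lambda r: sum(1 for v in r.values() if v not in (None, "")))


def build_clean_dataset(grouped, unique_rows, survivor=most_complete):
    """grouped: iterable of (group_id, record dict); unique_rows: iterable of record dicts."""
    groups = defaultdict(list)
    for group_id, record in grouped:
        groups[group_id].append(record)
    # One surviving record per group, followed by the rows that had no duplicates.
    survivors = [survivor(records) for records in groups.values()]
    return survivors + list(unique_rows)


grouped = [
    ("g1", {"name": "Ann", "email": ""}),
    ("g1", {"name": "Ann", "email": "ann@example.com"}),
    ("g2", {"name": "Bob", "email": None}),
]
unique_rows = [{"name": "Cy", "email": "cy@example.com"}]
clean = build_clean_dataset(grouped, unique_rows)
```

A different rule, such as keeping the most recent record, can be passed through the survivor parameter without changing the merge logic.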