Generating a matching model from a Grouping campaign
This scenario applies only to subscription-based Talend Platform products with Big Data and Talend Data Fabric.
tMatchModel reads the sample of suspect pairs computed on a list of duplicate childhood education centers and labeled by data stewards in Talend Data Stewardship. It generates several matching models, searches for the best combination of learning parameters, and keeps the matching model that performs best in cross-validation.
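The idea of trying several parameter combinations and keeping the one that scores best in cross-validation can be illustrated with a minimal, standard-library Python sketch. It is not the tMatchModel implementation: the sample pairs, the single `bias` parameter, and the threshold "model" are all hypothetical stand-ins for the labeled suspect pairs, the learning parameters, and the classifier that the component actually trains.

```python
from statistics import mean

# Hypothetical labeled suspect pairs: (similarity score, steward label).
# Label 1 means the two records match, 0 means they do not.
PAIRS = [
    (0.95, 1), (0.91, 1), (0.88, 1), (0.82, 1), (0.79, 1), (0.74, 1),
    (0.58, 0), (0.55, 0), (0.48, 0), (0.40, 0), (0.33, 0), (0.20, 0),
]

def k_folds(items, k):
    """Yield (train, validation) splits using k interleaved folds."""
    for i in range(k):
        validation = items[i::k]
        train = [x for j, x in enumerate(items) if j % k != i]
        yield train, validation

def fit(train, bias):
    """Toy model: a threshold halfway between the mean score of matches
    and the mean score of non-matches, shifted by the parameter `bias`."""
    matches = mean([s for s, label in train if label == 1])
    non_matches = mean([s for s, label in train if label == 0])
    return (matches + non_matches) / 2 + bias

def accuracy(threshold, pairs):
    """Share of pairs whose prediction (score >= threshold) is correct."""
    return mean([int((s >= threshold) == bool(label)) for s, label in pairs])

def cross_validate(bias, pairs, k=3):
    """Average validation accuracy of one parameter value over k folds."""
    return mean([accuracy(fit(train, bias), validation)
                 for train, validation in k_folds(pairs, k)])

# Try each candidate parameter value and keep the best-scoring one.
CANDIDATES = [-0.3, 0.0, 0.3]
best_bias = max(CANDIDATES, key=lambda b: cross_validate(b, PAIRS))

# Refit on all labeled pairs with the winning parameter value.
best_model = fit(PAIRS, best_bias)
```

With this sample data, the unshifted threshold (`bias` of 0.0) classifies every validation pair correctly in all three folds, so it is the candidate that is kept and refitted on the full sample.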
Prerequisites:
- You have generated the suspect data pairs by using the tMatchPairing component and labeled them in Talend Data Stewardship. For further information, see Computing suspect pairs and writing a sample.
For further information about handling grouping tasks to decide on the relationship between pairs of records, see Talend Data Stewardship Examples.