Setting up the Job

Procedure

  1. Drop the following components from the Palette onto the design workspace: tFileInputDelimited, tMatchPredict and tFileOutputDelimited.
  2. Connect tFileInputDelimited to tMatchPredict using the Main link.
  3. Connect tMatchPredict to tFileOutputDelimited using the Suspect duplicates link.
  4. Check that you have defined the connection to the Spark cluster and activated checkpointing in the Run > Spark Configuration view, as described in Computing suspect pairs and suspect sample from source data. For more information about selecting the Spark mode, see the documentation on Talend Help Center (https://help.talend.com).
