tDataprepRun

Applies a preparation made using Talend Data Preparation in a standard Data Integration Job.

tDataprepRun fetches a preparation made using Talend Data Preparation and applies it to a set of data.

Using the Deployment drop-down list in the Basic settings of the tDataprepRun component, you can select one of the two versions of the component:

  • The On-Premises deployment, which runs the preparation on the Talend Data Preparation server.
  • The Cloud deployment (Beta), which runs the preparation on the same engine as the Job. Your data does not leave your infrastructure: the only exchange is the retrieval of the preparation information from Talend Cloud Data Preparation.

Note: This component is not shipped with your Talend Studio by default. You need to install the Talend Data Preparation components in the Data Integration > Components section of the Feature Manager before you can use it in your Talend Studio. For more information, see Installing features using the Feature Manager.

Note: For reference, tDataprepRun can process datasets of up to 10 million rows and 100 columns (7 GB) at a speed of around 200 rows per second (150 kB/s) for a 60-step preparation (these figures are indicative and may vary). For better performance or datasets beyond 10 million rows, consider using Spark Jobs.
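
To put the indicative throughput in perspective, the following minimal Python sketch converts a row count into a rough run time. It is not part of Talend: the function name and constant are hypothetical, and the 200 rows per second default is only the indicative figure quoted above, so actual run times may differ noticeably.

```python
# Indicative throughput quoted in the note above (assumption: real values vary
# with the preparation, the data, and the environment).
INDICATIVE_ROWS_PER_SECOND = 200


def estimate_runtime_hours(row_count, rows_per_second=INDICATIVE_ROWS_PER_SECOND):
    """Return a rough run-time estimate, in hours, for processing row_count rows."""
    if rows_per_second <= 0:
        raise ValueError("rows_per_second must be positive")
    return row_count / rows_per_second / 3600


if __name__ == "__main__":
    # 10,000,000 rows / 200 rows per second = 50,000 seconds
    print(f"{estimate_runtime_hours(10_000_000):.1f} hours")  # prints "13.9 hours"
```

At the indicative rate, a dataset at the 10-million-row limit would take roughly 14 hours, which helps explain why Spark Jobs are suggested for larger volumes or better performance.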
