Export options and runtimes matrix
In Talend Data Preparation, two runtimes are available to process your data when you export a preparation:
- A Java runtime
- A Big Data runtime based on Apache Beam, available when using Talend Data Preparation in a Big Data context
Depending on the data source and the export target, the runtime used varies, and you may or may not be able to export preparations made on large datasets. For more information, see Working on large datasets.
The Spark Job Server and the Streams Runner are the two components required to process an export in a Big Data context. To better understand how the export is processed when using Big Data on Spark, see the Talend Data Preparation architecture.
The different behaviors and possibilities are listed in the table below.
Input/Output | Local CSV/Excel/Tableau file | HDFS file | Amazon S3 |
---|---|---|---|
Local CSV/Excel/Tableau file | Java runtime | Not available | Java runtime |
Talend Job | Java runtime | Not available | Java runtime |
JDBC | Java runtime | Big Data runtime | Big Data runtime if available, Java runtime otherwise |
HDFS | Java runtime | Big Data runtime | Big Data runtime |
Amazon S3 | Java runtime | Big Data runtime | Big Data runtime if available, Java runtime otherwise |
Salesforce | Java runtime | Not available | Java runtime |
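
As an illustration only, the matrix above can be written as a lookup table, reading each row as an input and each column as an export target. The following Python sketch is not part of Talend Data Preparation: the names `RUNTIME_MATRIX` and `resolve_runtime` and the `big_data_available` flag are hypothetical, and the only logic added is the fallback stated in the "Big Data runtime if available, Java runtime otherwise" cells.

```python
from typing import Optional

JAVA = "Java runtime"
BIG_DATA = "Big Data runtime"
# Cell meaning "Big Data runtime if available, Java runtime otherwise".
BIG_DATA_OR_JAVA = "Big Data runtime if available, Java runtime otherwise"

LOCAL, HDFS_FILE, S3 = "Local CSV/Excel/Tableau file", "HDFS file", "Amazon S3"

# Hypothetical encoding of the table: input -> export target -> runtime.
# None stands for "Not available".
RUNTIME_MATRIX = {
    "Local CSV/Excel/Tableau file": {LOCAL: JAVA, HDFS_FILE: None, S3: JAVA},
    "Talend Job": {LOCAL: JAVA, HDFS_FILE: None, S3: JAVA},
    "JDBC": {LOCAL: JAVA, HDFS_FILE: BIG_DATA, S3: BIG_DATA_OR_JAVA},
    "HDFS": {LOCAL: JAVA, HDFS_FILE: BIG_DATA, S3: BIG_DATA},
    "Amazon S3": {LOCAL: JAVA, HDFS_FILE: BIG_DATA, S3: BIG_DATA_OR_JAVA},
    "Salesforce": {LOCAL: JAVA, HDFS_FILE: None, S3: JAVA},
}


def resolve_runtime(source: str, target: str, big_data_available: bool) -> Optional[str]:
    """Return the runtime listed in the table, or None if the export is not available."""
    cell = RUNTIME_MATRIX[source][target]
    if cell == BIG_DATA_OR_JAVA:
        # The only conditional cell in the table: fall back to the Java runtime.
        return BIG_DATA if big_data_available else JAVA
    return cell
```

For example, `resolve_runtime("JDBC", "Amazon S3", big_data_available=False)` yields the Java runtime, while `resolve_runtime("Salesforce", "HDFS file", big_data_available=True)` yields `None` because that combination is listed as not available.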