tDeltaLakeOutput (Technical Preview)
Writes records to the Delta Lake layer of your Data Lake system in the Parquet format.
Delta Lake is an open source storage layer that brings ACID (Atomicity, Consistency, Isolation, Durability) transactions and scalable metadata handling to Data Lakes, and unifies streaming and batch data processing. Concretely, data stored in Delta Lake takes the shape of versioned Parquet files accompanied by their transaction logs.
For further information, see the Delta Lake documentation at https://docs.delta.io/latest/index.html.
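To illustrate what such a write looks like at the Spark level, the following minimal sketch uses the open source Delta Lake connector directly; it is not the code generated by tDeltaLakeOutput, and the table path and column names are illustrative assumptions.

```scala
import org.apache.spark.sql.SparkSession

object DeltaWriteSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("delta-write-sketch")
      // The Delta Lake connector must be on the classpath (delta-spark / delta-core).
      .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
      .config("spark.sql.catalog.spark_catalog",
              "org.apache.spark.sql.delta.catalog.DeltaCatalog")
      .getOrCreate()

    import spark.implicits._

    // Example records; in a Job these would come from the incoming flow.
    val records = Seq((1, "alice"), (2, "bob")).toDF("id", "name")

    // Each write produces versioned Parquet data files plus a _delta_log
    // directory holding the JSON transaction log.
    records.write
      .format("delta")
      .mode("append")                   // hypothetical write mode for this sketch
      .save("/tmp/datalake/customers")  // hypothetical Delta table path

    spark.stop()
  }
}
```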
Depending on the Talend product you are using, this component can be used in one, some or all of the following Job frameworks:
- Spark Batch: see tDeltaLakeOutput properties for Apache Spark Batch.
  The component in this framework is available in all subscription-based Talend products with Big Data and Talend Data Fabric.
- Spark Streaming: see tDeltaLakeOutput properties for Apache Spark Streaming.
  This component is available in Talend Real Time Big Data Platform and Talend Data Fabric.