| Component | Description |
| --- | --- |
| tCacheIn | Offers faster access to the persistent data. |
| tCacheOut | Persists the input RDDs at the storage level you define, in order to offer faster access to these datasets later (see the Spark sketch after this table). |
| tExtractDynamicFields | Parses a Dynamic column to create standard output columns. |
| tExtractEDIField | Reads EDI-structured data from an EDIFACT message file, generates an XML document according to the EDIFACT family and type, extracts data by parsing that XML with XPath queries (defined manually or taken from the Repository wizard), and sends the data to the next component via a Row connection. |
| tExtractRegexFields | Extracts data and generates multiple columns from a formatted string using regex matching. |
| tSample | Returns a sample subset of the data being processed. |
| tSqlRow | Performs SQL queries over input datasets. |
| tTop | Sorts data and outputs a given number of rows, starting from the first row of the sorted data. |
| tTopBy | Groups and sorts data, then outputs a given number of rows from the first row of each group. |
| tWindow | Applies a given Spark window on the incoming RDDs and sends the window-based RDDs to the following component. |
| tWriteAvroFields | Transforms the incoming data into Avro files. |
| tWriteDelimitedFields | Converts records into byte arrays in a delimited format. |
| tWriteDynamicFields | Creates a dynamic schema from input columns in the component. |
| tWritePositionalFields | Converts records into byte arrays in a positional format. |
| tWriteXMLFields | Converts records into byte arrays in an XML format. |
| tAggregateRow | Receives a flow and aggregates it based on one or more columns. |
| tAggregateSortedRow | Aggregates the sorted input data for each output column based on a set of operations. Each output column is configured with as many rows as required, the operations to be carried out, and the input column from which the data is taken. |
| tConvertType | Automatically converts one Talend Java type to another, thus avoiding compilation errors. |
| tDenormalize | Denormalizes the input flow based on one column. |
| tDenormalizeSortedRow | Synthesizes the sorted input flow to save memory. |
| tExternalSortRow | Sorts input data based on one or several columns, by sort type and order, using an external sort application. |
| tExtractDelimitedFields | Generates multiple columns from a delimited string column. |
| tExtractJSONFields | Extracts the desired data from JSON fields based on the JSONPath or XPath query. |
| tExtractPositionalFields | Extracts data and generates multiple columns from a formatted string using positional fields. |
| tExtractXMLField | Reads XML-structured data from an XML field and sends the data as defined in the schema to the following component. |
| tFilterColumns | Homogenizes schemas by ordering the columns, removing unwanted columns, or adding new columns. |
| tFilterRow | Filters input rows by setting one or more conditions on the selected columns. |
| tJoin | Performs inner or outer joins between the main data flow and the lookup flow. |
| tNormalize | Normalizes the input flow following the SQL standard to help improve data quality and thus ease data updates. |
| tPartition | Allows you to visually define how an input dataset is partitioned. |
| tReplace | Carries out search-and-replace operations on the defined input columns to cleanse data before further processing. |
| tReplicate | Duplicates the incoming schema into two identical output flows. |
| tSampleRow | Selects rows according to a list of single lines and/or a list of groups of lines. |
| tSortRow | Sorts input data based on one or several columns, helping to create metrics and classification tables. |
| tSplitRow | Splits one input row into several output rows. |
| tUniqRow | Removes duplicate rows to ensure the data quality of the input or output flow in a Job. |
| tUnite | Centralizes data from various and heterogeneous sources. |
| tWriteJSONField | Transforms the incoming data into JSON fields and transfers them to a file, a database table, etc. |
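
For readers who know the Spark API better than the Studio palette, the minimal sketch below is a hand-written analogy of what a few of the components above do on a dataset: persisting at an explicit storage level (tCacheOut/tCacheIn), sampling (tSample), sorting and taking the top rows (tTop), and running SQL over the input (tSqlRow). It is not the code Talend generates, and the dataset name `orders` and its columns are hypothetical.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.storage.StorageLevel

object PaletteAnalogy {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("palette-analogy")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical input: (customer, amount) pairs.
    val orders = spark.sparkContext.parallelize(Seq(
      ("alice", 120.0), ("bob", 75.5), ("carol", 310.0), ("alice", 42.0)
    ))

    // tCacheOut / tCacheIn: persist at an explicit storage level so that
    // later steps read the cached data instead of recomputing it.
    orders.persist(StorageLevel.MEMORY_AND_DISK)

    // tSample: return a sample subset of the data being processed.
    val sampled = orders.sample(withReplacement = false, fraction = 0.5)

    // tTop: sort the data and output a given number of rows from the top.
    val top2 = orders
      .sortBy({ case (_, amount) => amount }, ascending = false)
      .take(2)

    // tSqlRow: perform an SQL query over the input dataset.
    orders.toDF("customer", "amount").createOrReplaceTempView("orders")
    val totals = spark.sql(
      "SELECT customer, SUM(amount) AS total FROM orders GROUP BY customer")

    println(s"sampled count: ${sampled.count()}, top 2: ${top2.mkString(", ")}")
    totals.show()

    spark.stop()
  }
}
```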