tAggregateRow | Receives a flow and aggregates it based on one or more columns.
tAggregateSortedRow | Aggregates the sorted input data for output columns based on a set of operations. Each output column is configured with as many rows as required, the operations to be carried out, and the input column from which the data will be taken.
tCacheIn | Offers faster access to the persistent data.
tCacheOut | Persists the input RDDs depending on the specific storage level you define in order to offer faster access to these datasets later (see the sketch after this table).
tConvertType | Converts one Talend Java type to another automatically, thus avoiding compilation errors.
tDenormalize | Denormalizes the input flow based on one column.
tDenormalizeSortedRow | Synthesizes sorted input flow to save memory.
tExternalSortRow | Sorts input data based on one or several columns, by type (numeric or alphabetical) and order (ascending or descending), using an external sort application.
tExtractDelimitedFields | Generates multiple columns from a delimited string column.
tExtractDynamicFields | Parses a Dynamic column to create standard output columns.
tExtractEDIField | Reads the EDI structured data from an EDIFACT message file, generates XML according to the EDIFACT family and type, extracts data by parsing the generated XML with XPath queries defined manually or retrieved from the Repository wizard, and sends the data to the next component via a Row connection.
tExtractJSONFields | Extracts the desired data from JSON fields based on the JSONPath or XPath query.
tExtractPositionalFields | Extracts data and generates multiple columns from a formatted string using positional fields.
tExtractRegexFields | Extracts data and generates multiple columns from a formatted string using regex matching.
tExtractXMLField | Reads the XML structured data from an XML field and sends the data as defined in the schema to the following component.
tFilterColumns | Homogenizes schemas by ordering the columns, removing unwanted columns, or adding new columns.
tFilterRow | Filters input rows by setting one or more conditions on the selected columns.
tJoin | Performs inner or outer joins between the main data flow and the lookup flow.
tNormalize | Normalizes the input flow following the SQL standard, which helps improve data quality and thus eases data updates.
tPartition | Allows you to visually define how an input dataset is partitioned.
tReplace | Cleanses all files before further processing.
tReplicate | Duplicates the incoming schema into two identical output flows.
tSample | Returns a sample subset of the data being processed.
tSampleRow | Selects rows according to a list of single lines and/or a list of groups of lines.
tSortRow | Sorts input data to help create metrics and classification tables.
tSplitRow | Splits one input row into several output rows.
tSqlRow | Performs SQL queries over input datasets.
tTop | Sorts data and outputs a specified number of rows, starting from the first row of the sorted data.
tTopBy | Groups and sorts data and outputs a specified number of rows, starting from the first row of each group.
tUniqRow | Ensures data quality of input or output flow in a Job.
tUnite | Centralizes data from various and heterogeneous sources.
tWindow | Applies a given Spark window on the incoming RDDs and sends the window-based RDDs to its following component.
tWriteAvroFields | Transforms the incoming data into Avro files.
tWriteDelimitedFields | Converts records into byte arrays.
tWriteDynamicFields | Creates a dynamic schema from input columns in the component.
tWriteJSONField | Transforms the incoming data into JSON fields and transfers them to a file, a database table, etc.
tWritePositionalFields | Converts records into byte arrays.
tWriteXMLFields | Converts records into byte arrays.
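The cache components above (tCacheOut and tCacheIn) wrap Spark's own persistence mechanism. The following minimal Java sketch illustrates only that underlying idea, not the code Talend generates: it persists an RDD with an explicit storage level so that later processing can reuse it without recomputation. The class name, sample data, and local master setting are assumptions made for the example.

    import java.util.Arrays;

    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaRDD;
    import org.apache.spark.api.java.JavaSparkContext;
    import org.apache.spark.storage.StorageLevel;

    public class CacheSketch {
        public static void main(String[] args) {
            // Local Spark context, for illustration only.
            SparkConf conf = new SparkConf().setAppName("cache-sketch").setMaster("local[*]");
            JavaSparkContext sc = new JavaSparkContext(conf);

            // A small in-memory dataset standing in for the Job's input flow.
            JavaRDD<String> rows = sc.parallelize(Arrays.asList("a;1", "b;2", "c;3"));

            // tCacheOut's role: persist the RDD with an explicit storage level
            // so that later branches can reuse it without recomputation.
            rows.persist(StorageLevel.MEMORY_AND_DISK());

            // tCacheIn's role: downstream processing reads the cached dataset.
            System.out.println("cached rows: " + rows.count());

            sc.stop();
        }
    }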