Pig components

tPigAggregate Adds one or more additional columns to the output of the grouped data to create data to be used by Pig.
tPigCode Extends the functionality of a Talend Job by using Pig scripts.
tPigCoGroup Performs the Pig COGROUP operation to group and aggregate data incoming from multiple Pig flows.
tPigCross Uses the Pig CROSS operator to compute the Cartesian product of two or more relations.
tPigDistinct Removes duplicate tuples in a relation.
tPigFilterColumns Selects data or filters out data from a relation based on defined filter conditions.
tPigFilterRow Applies filtering conditions on one or more specified columns, in a Pig process, in order to split or filter data from a relation.
tPigJoin Performs inner joins and outer joins of two files based on join keys to create data that will be used by Pig.
tPigLoad Loads original input data to an output stream in just one single transaction, once the data has been validated.
tPigMap Transforms and routes data from single or multiple sources to single or multiple destinations.
tPigReplicate Performs different operations on the same schema.
tPigSort Sorts a relation based on one or more defined sort keys.
tPigStoreResult Stores the result of your Pig Job into a defined data storage space.
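
Several of the components above wrap relational operators whose semantics are easy to misread from a one-line description. The following is a minimal sketch in plain Python, not Pig Latin and not Talend-generated code, of what CROSS, DISTINCT, and COGROUP do to small in-memory relations. The function names `cross`, `distinct`, and `cogroup`, the `key_index` parameter, and the sample data are all hypothetical and exist only for illustration.

```python
from collections import defaultdict
from itertools import product


def cross(left, right):
    """Cartesian product: every left tuple concatenated with every right tuple."""
    return [l + r for l, r in product(left, right)]


def distinct(relation):
    """Drop duplicate tuples. This sketch keeps first-seen order;
    Pig itself makes no ordering guarantee for DISTINCT."""
    return list(dict.fromkeys(relation))


def cogroup(left, right, key_index=0):
    """Group two relations by a shared key.
    Each key maps to a pair (bag of left tuples, bag of right tuples);
    a key present on only one side gets an empty bag for the other."""
    groups = defaultdict(lambda: ([], []))
    for row in left:
        groups[row[key_index]][0].append(row)
    for row in right:
        groups[row[key_index]][1].append(row)
    return dict(groups)


# Hypothetical sample relations.
users = [(1, "ann"), (2, "bob"), (1, "ann")]
orders = [(1, "book"), (3, "pen")]

unique_users = distinct(users)            # what tPigDistinct would conceptually produce
pairs = cross(unique_users, orders)       # what tPigCross would conceptually produce
grouped = cogroup(unique_users, orders)   # what tPigCoGroup would conceptually produce
```

Note that the cross product grows multiplicatively with the size of its inputs, which is why deduplicating first, as in the sketch, keeps the intermediate result small.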
