Double-click tDataShuffling to display the
Basic settings view and define the component
properties.
Click Sync columns to retrieve the schema
defined in the input component.
In the Shuffling columns table, click the
[+] button to add four rows, and then:
in Column, select each column
whose data will be shuffled,
in Group ID, select the group
identifier for each column. Columns that share a group identifier
are shuffled together.
In this example, there are two groups of columns to be
shuffled:
Group ID 1: credit_card
Group ID 2: lname, fname and mi
The Job replaces each credit card number in the credit_card column with a value from a different row. It also keeps
the last name, first name and middle initial values from the lname, fname and mi
columns together, and replaces them as a unit with the values from a different row.
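The effect of group identifiers can be illustrated with a minimal Python sketch. This is not the component's implementation: the function name shuffle_groups, the sample rows and the seed parameter are all hypothetical, and the sketch uses a plain random permutation, so a value may occasionally stay on its original row.

```python
import random


def shuffle_groups(rows, groups, seed=None):
    """Return a copy of rows with each column group permuted independently.

    rows   : list of dicts, one per data row (hypothetical sample layout)
    groups : dict mapping a group ID to the list of columns in that group
    """
    rng = random.Random(seed)
    result = [dict(row) for row in rows]
    for columns in groups.values():
        # One permutation of row indices per group, so all columns in the
        # same group move together and stay consistent with each other.
        order = list(range(len(rows)))
        rng.shuffle(order)
        for target, source in enumerate(order):
            for column in columns:
                result[target][column] = rows[source][column]
    return result


# Illustrative data only; values are made up.
rows = [
    {"id": 1, "credit_card": "4111-0001", "lname": "Smith", "fname": "Ann", "mi": "B"},
    {"id": 2, "credit_card": "4111-0002", "lname": "Jones", "fname": "Raj", "mi": "K"},
    {"id": 3, "credit_card": "4111-0003", "lname": "Meyer", "fname": "Eva", "mi": "L"},
    {"id": 4, "credit_card": "4111-0004", "lname": "Costa", "fname": "Leo", "mi": "M"},
]
groups = {1: ["credit_card"], 2: ["lname", "fname", "mi"]}

shuffled = shuffle_groups(rows, groups, seed=42)
for row in shuffled:
    print(row)
```

In the output, each lname, fname and mi triple still belongs to one original person, while the credit card numbers are reassigned separately. Columns that are not listed in any group, such as id here, are left unchanged.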
Click the Advanced settings tab.
In the Partitioning columns table, click the
[+] button to add one row, and select the column to partition by.
The Job shuffles data only among the rows that share the same value in the
partitioning columns.
In this example, the component is configured to apply the shuffling
process within the rows that share the same value in the country column.
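Partitioning can be pictured as shuffling separately inside each set of rows that have the same partition value. The following Python sketch illustrates that behavior under stated assumptions: the function name shuffle_within_partitions, the sample data and the seed parameter are hypothetical and are not part of the component.

```python
import random
from collections import defaultdict


def shuffle_within_partitions(rows, columns, partition_columns, seed=None):
    """Shuffle the given columns, exchanging values only between rows
    that have identical values in partition_columns."""
    rng = random.Random(seed)
    result = [dict(row) for row in rows]

    # Collect the row indices that belong to each distinct partition key.
    partitions = defaultdict(list)
    for index, row in enumerate(rows):
        key = tuple(row[c] for c in partition_columns)
        partitions[key].append(index)

    # Permute the rows of each partition independently of the others.
    for indices in partitions.values():
        sources = indices[:]
        rng.shuffle(sources)
        for target, source in zip(indices, sources):
            for column in columns:
                result[target][column] = rows[source][column]
    return result


# Illustrative data only; values are made up.
customers = [
    {"country": "FR", "credit_card": "4111-0001"},
    {"country": "FR", "credit_card": "4111-0002"},
    {"country": "US", "credit_card": "4111-0003"},
    {"country": "US", "credit_card": "4111-0004"},
    {"country": "US", "credit_card": "4111-0005"},
]

masked = shuffle_within_partitions(customers, ["credit_card"], ["country"], seed=7)
for row in masked:
    print(row)
```

A card number that started on an FR row can only end up on another FR row, so per-country statistics on the shuffled columns are preserved.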