Transform
The Transform package modifies data by applying user defined functions and operators onto data columns (fields).
Once source entities are added to the canvas, the Transform package is brought onto the canvas by double-clicking or dragging the icon. When the square icon becomes a live controller on the canvas, double-click to reveal a field mapping panel where fields can be added, deleted, or converted. Custom expressions can be built and applied to each field.
Expand or shrink the field definition window by moving size control. Create more space to Add Fields.
Auto Mapping: Once source and target entities are chosen/created, connect the source outport to the Transform package inport and the Transform package outport to the target entity inport. Select Auto Mapping to map the fields within the control package on the UI.
Auto Mapping populates IN and OUT fields.
Select the triangle directly to the right of source fields to access Select Fields popup. This functionality allows users to select multiple fields for operations and functions.
Select and access the expression builder to define new conditions for target fields. For more information see Custom Expression Builder.
Build a custom expression for each field to be transformed, Validate each expression. Select OK. The function executes as a Pig script.
Transform operations in this example:
The field NAME is being transformed with the UPPER Pig function to return the string with upper case letters.
Custom expression entry: UPPER(NAME)
The field 'MAJOR' is being transformed with a Pig ternary operator to return 'Fine Art' if the value is 'Art', otherwise it returns 'Not Fine Art'.
Custom Expression Builder entry: (Major=='Art'?:'Fine Art':'Not Fine Art')
Validate and Execute the transformation (where the transformed data is saved to the target entity).
The target entity tab on the prepare canvas provides sample data once the data has been loaded upon successful validation and execution. Target entities will also be located in the discovery module once created.
The below example shows sample data from the source entity (Student_Major) and target entity (Student_Major_Tr8) where the NAME field was transformed to UPPER case characters and the 'Major' field was transformed so that an 'Art' value became 'Fine Art' and every other major now has a value 'Not Fine Art'.
OUTPUT/Target entity (Student_Major_Tr8)
CSV Upload for Transform
Analysts may find the need to re-use Transform expression definitions across different dataflows. For these use cases, they are able to upload a pre-built CSV from a local machine to generate desired transforms. The following example illustrates an uploaded set of Transform expressions where source entity EMPLOYEES fields are transformed by expressions uploaded as a CSV file.
Select the CSV file on local machine for upload. CSVs are created by entering the function expression in the IN column (column A) and naming OUT Field (column B) exactly as they would be entered manually in Transform criteria fields.
The current example demonstrates application of the following Transforms:
EMPLOYEE_ID: No Transform
LAST_NAME: Converted to UPPER case
EMAIL: One EMAIL ('hr@acme.com') will be entered for all records
PHONE_NUMBER: The entered PHONE_NUMBER displays for all records
HIRE_DATE: Transformed date will be transformed by RollForwardToFridayOrEndOfMonth function so that value for that field will become the date for the first Friday of the week of the Hire Date or the last business day of the month (whichever comes first)
JOB-ID: No Transform
SALARY: All employee salaries increase by 500,000 dollars
The CSV populates the fields. If the application encounters mismatched fields or inaccuracies, errors will indicate which fields do not map correctly. Validate the mappings, select Apply.
Once the target entity is added and the Transforms are executed, target entity fields are transformed identically as if the expressions had been created manually.