Advanced mapping using filters, explicit joins and rejections
This scenario introduces a Job that allows you to find BMW owners who have two to six children (inclusive), for sales promotion purpose for example.
Linking the components
Procedure
- Drop three tFileInputDelimited components, a tMap component, and two tFileOutputDelimited components from the Palette onto the design workspace, and label them to best describe their functions.
-
Connect the input components to the tMap
using Row > Main connections.
Pay attention to the file you connect first as it will automatically be set as Main flow, and all the other connections will be Lookup flows. In this example, the connection for the input component Owners is the Main flow.
Configuring the components
Procedure
-
Define the properties of each input components in the respective Basic settings view. Define the properties of
Owners.
-
Select Repository from the Property type list and select the component's
schema, owners in this scenario, from the Repository Content dialog box. The rest fields
are automatically filled.
Information noteNote:
In this scenario, the input schemas are stored in the Metadata node of the Repository tree view for easy retrieval. For further information regarding metadata creation in the Repository, see Managing metadata in Talend Studio.
In the same way, set the properties of the other input components: Cars and Resellers. These two Lookup flows will fill in secondary (lookup) tables in the input area of the Map Editor. -
Then double-click the tMap component to
launch the Map Editor and define the
mappings and filters.
Set an explicit join between the Main flow Owner and the Lookup flow Cars by dropping the ID_Owner column of the Owners table to the ID_Owner column of the Cars table.The explicit join is displayed along with a hash key.
-
In the Expr. Key field of the
Make column, type in a filter. In this use case,
simply type in "BMW" as the search is focused on the owners of
this particular make.
- Implement a cascading join between the two lookup tables Cars and Resellers on the ID_Reseller column in order to retrieve resellers information.
-
As you want to reject the null values into a separate table and exclude
them from the standard output, click the tMap
settings button and set Join
Model to Inner Join in each
of the Lookup tables.
-
In the tMap settings, you can set Match
Model to Unique match,
First match, or All matches. In this use case, the All
matches option is selected. Thus if several matches are found
in the Inner Join, rows matching the explicit join as well as the filter,
all of them will be added to the output flow (either in rejection or the
regular output).
Information noteNote:
The Unique match option functions as a Last match. The First match and All matches options function as named.
- On the output area of the Map Editor, click the plus button to add two tables, one for the full matches and the other for the rejections.
- Drag all the columns of the Owners table, the Registration, Make and Color columns of the Cars table, and the ID_Reseller and Name_Reseller columns of the Resellers table to the main output table.
- Drag all the columns of the Owners table to the reject output table.
-
Click the Filter button at the top of the
main output table to display the Filter
expression area.
Type in a filter statement to narrow down the number of rows loaded in the main output flow. In this use case, the statement reads: Owners.Children_Nr >= 2 && Owners.Children_Nr <= 6.
-
In the reject output table, click the tMap
settings button and set the reject types.
Set Catch output reject to true to collect data about BMW car owners who have less than two or more than six children.Set Catch lookup inner join reject to true to collect data about owners of other car makes and owners for whom the reseller information is not found.Click OK to validate the mappings and close the Map Editor.On the design workspace, right-click the tMap and pull the respective output link to the relevant output components.
-
Define the properties of the output components in their respective
Basic settings view.
In this use case, simple specify the output file paths and select the Include Header check box, and leave the other parameters as they are.
Executing the Job
Procedure
- Press Ctrl + S to save your Job.
-
Press F6 to run it.
The main output file contains the information related to BMW owners who have two to six children, and the reject output file contains the information about the rest of the car owners.
For examples of how to use dynamic schemas with tMap, see:
Did this page help you?
If you find any issues with this page or its content – a typo, a missing step, or a technical error – please let us know!