To implement this scenario, break the Job down into four
steps:
Create the Job, define the schema for the input data, and read the input file according
to the defined schema.
Set the command to enable the output stream feature.
Map the data using the tMap component.
Output the selected data stream.
The finished Job is shown in the following image. For detailed instructions on
designing the Job, read the following sections.
Step 1: Reading input data from a local file
You will use the tFileInputDelimited
component to read the file customers.csv for the input data. This
component can be found in the File/Input group of the
Palette.
Procedure
Place a tFileInputDelimited
component onto the design workspace, and double-click it to open the Basic settings view to set its properties.
Click the three-dot button next to the File
name/Stream field to browse to the path of the input data file. You
can also type in the path of the input data file manually.
Click Edit schema to open a
dialog box to configure the file structure of the input file.
Click the plus button to add six columns and set the Type and column names to the values listed in the
following screenshot:
Click OK to close the dialog
box.
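Outside of the Studio, the idea behind this step (read a delimited file line by line and check each row against a fixed number of schema columns) can be sketched in plain Java. This is only an illustration, not how tFileInputDelimited is implemented. The class name, the method name, the semicolon separator and the sample rows are all assumptions made for the example; the real contents of customers.csv are not shown on this page.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Talend.
class DelimitedReaderSketch {

    // Reads every line, splits it into fields, and rejects rows whose
    // field count differs from the number of columns in the schema.
    static List<String[]> readRows(Reader source, String separator, int columnCount)
            throws IOException {
        List<String[]> rows = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                // Pattern.quote treats the separator literally; -1 keeps trailing empty fields.
                String[] fields = line.split(Pattern.quote(separator), -1);
                if (fields.length != columnCount) {
                    throw new IllegalArgumentException(
                            "Expected " + columnCount + " fields but found "
                                    + fields.length + ": " + line);
                }
                rows.add(fields);
            }
        }
        return rows;
    }

    public static void main(String[] args) throws IOException {
        // Made-up sample data with six columns (assumption for illustration).
        String sample = "1;Ann;32;a;b;c\n2;Bob;41;d;e;f\n";
        for (String[] row : readRows(new StringReader(sample), ";", 6)) {
            System.out.println(String.join(" | ", row));
        }
    }
}
```

To read a real file instead of the in-memory sample, pass a java.io.FileReader for the file path as the first argument.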
Step 2: Setting the command to enable the output stream feature
You will use tJava to set the command for
creating an output file and a directory that contains the output file.
Procedure
Place a tJava component onto the
design workspace, and double-click it to open the Basic
settings view to set its properties.
Fill in the Code area with the
following command:
new java.io.File("C:/myFolder").mkdirs();
globalMap.put("out_file",new java.io.FileOutputStream("C:/myFolder/customerselection.txt",false));
Tip:
The command you typed in this step creates the directory
C:/myFolder, opens an output stream to the file
customerselection.txt in that directory, and stores the stream
in globalMap under the out_file key. You can change the directory and
file name to suit your environment.
Connect tJava to tFileInputDelimited using a Trigger > On Subjob Ok connection.
This will trigger the subJob that starts with tFileInputDelimited when tJava
succeeds in running.
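The mechanism used in this step, one component opening a stream and publishing it under a key so that a later component can fetch and cast it, can be tried out in a standalone Java sketch. It assumes that globalMap behaves like an ordinary map from String keys to objects, and it uses a temporary directory instead of C:/myFolder so that it runs on any system. The class and method names are hypothetical.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-alone illustration, not Talend-generated code.
class OutputStreamSharingSketch {

    // Stand-in for the Job's globalMap (assumption: a plain String-to-Object map).
    static final Map<String, Object> globalMap = new HashMap<>();

    // Mirrors the tJava code: create the folder, open the stream, publish it.
    static void openSharedStream(File folder, String fileName) throws IOException {
        folder.mkdirs(); // also creates missing parent directories
        // The second argument, false, overwrites the file instead of appending to it.
        globalMap.put("out_file", new FileOutputStream(new File(folder, fileName), false));
    }

    // Mirrors the expression typed later in the Output Stream field.
    static OutputStream sharedStream() {
        return (OutputStream) globalMap.get("out_file");
    }

    public static void main(String[] args) throws IOException {
        File folder = Files.createTempDirectory("myFolder").toFile();
        openSharedStream(folder, "customerselection.txt");
        // Closing the stream flushes the data to disk.
        try (OutputStream out = sharedStream()) {
            out.write("1;Ann;32\n".getBytes(StandardCharsets.UTF_8));
        }
        File written = new File(folder, "customerselection.txt");
        System.out.println(Files.readAllLines(written.toPath()));
    }
}
```

The cast is needed because the map stores its values as Object; the consumer has to state which type it expects to get back.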
Step 3: Mapping the data using the tMap component
Procedure
Place a tMap component onto the
design workspace, and double-click it to open the Basic
settings view to set its properties.
Click the three-dot button next to Map
Editor to open a dialog box to set the mapping.
Click the plus button on the left to add six columns for the schema
of the incoming data. These columns should be the same as the following:
Click the plus button on the right to add a schema of the outgoing
data flow.
Select New output and click
OK to save the output schema.
For the time being, the output schema is still empty.
Click the plus button beneath the out1 table
to add three columns for the output data.
Place the id, CustomerName and CustomerAge columns onto their respective lines on the right.
Click OK to save the
settings.
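Conceptually, the mapping configured above is a projection: each incoming row has six columns and only three of them are passed on to out1. The following Java sketch shows such a projection on rows held as string arrays. It is not the code that tMap generates. The class name, the method name and the column positions 0, 1 and 2 for id, CustomerName and CustomerAge are assumptions, since the actual column order is defined in the schema of your Job.

```java
import java.util.Arrays;

// Hypothetical illustration of a column projection, not tMap-generated code.
class ColumnSelectionSketch {

    // Copies only the fields at the given positions into a new, narrower row.
    static String[] selectColumns(String[] inputRow, int[] keptIndexes) {
        String[] outputRow = new String[keptIndexes.length];
        for (int i = 0; i < keptIndexes.length; i++) {
            outputRow[i] = inputRow[keptIndexes[i]];
        }
        return outputRow;
    }

    public static void main(String[] args) {
        // Made-up six-column row; positions 0..2 are assumed to hold
        // id, CustomerName and CustomerAge.
        String[] inputRow = {"1", "Ann", "32", "a", "b", "c"};
        String[] out1 = selectColumns(inputRow, new int[] {0, 1, 2});
        System.out.println(Arrays.toString(out1));
    }
}
```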
Step 4: Outputting the selected data stream
Procedure
Place a tFileOutputDelimited
component onto the design workspace, and double-click it to open the Basic settings view to set its component
properties.
Select the Use Output Stream
check box to enable the Output Stream field and
fill the Output Stream field with the following
command:
(java.io.OutputStream)globalMap.get("out_file")
Note:
You can customize the expression in the Output Stream field: press Ctrl+Space to select a built-in expression from the list, or type
the expression into the field manually.
In this scenario, the expression in the Output
Stream field retrieves the stream that tJava stored in globalMap under the out_file key
and casts it to java.io.OutputStream. The filtered data is written through this stream
to the local file defined in the Code area of tJava.
Connect tFileInputDelimited to
tMap using a Row > Main connection and connect tMap to
tFileOutputDelimited using a Row > out1 connection which is defined in the Map
Editor of tMap.
Click Sync columns to retrieve
the schema defined in the preceding component.
Place a tLogRow component onto
the design workspace, and double-click it to open its Basic
settings view.
Select the Table radio button in
the Mode area.
Connect tFileOutputDelimited to
tLogRow using a Row > Main connection.
Click Sync columns to retrieve
the schema defined in the preceding component.
This Job is now ready to be executed.
Press Ctrl+S to save your Job and press
F6 to run it.
The content of the selected data is displayed on the console.
The selected data is also output to the specified local file
customerselection.txt.