Transforming a list of files into a data flow
The following scenario describes a Job that iterates over a list of files, picks up each filename and the current date, and transforms them into a flow that is displayed on the console.
-
Drop the following components: tFileList, tIterateToFlow and tLogRow from the Palette to the design workspace.
-
Connect tFileList to tIterateToFlow using an Iterate link, and connect tIterateToFlow to tLogRow using a Row Main connection.
-
In the tFileList Component view, set the directory where the list of files is stored.
-
In this example, the files are three simple .txt files held in one directory: Countries.
-
Case does not matter in this example, so clear the Case sensitive check box.
-
Leave the Include Subdirectories check box cleared.
-
Then select the tIterateToFlow component and click Edit Schema to set the new schema.
-
Add two new columns: Filename of String type and Date of Date type. In Java, make sure you define the correct date pattern for the Date column.
-
Click OK to validate.
-
Notice that the newly created schema appears in the Mapping table.
-
In each cell of the Value field, press Ctrl+Space bar to access the list of global and user-specific variables.
-
For the Filename column, use the global variable: tFileList_1_CURRENT_FILEPATH. It retrieves the current filepath, which gives the name of each file the Job iterates on.
-
For the Date column, use the Talend routine: TalendDate.getCurrentDate() (in Java).
-
Then, in the tLogRow Component view, select the Print values in cells of a table check box.
-
Save your Job and press F6 to execute it.
The filepath is displayed in the Filename column and the current date is displayed in the Date column.
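For readers who want to see the logic outside the Studio, the Job can be approximated in plain Java. This is only a minimal sketch under stated assumptions, not the code Talend generates: the class FileListToFlowSketch, its helper methods, the sample file names and the dd-MM-yyyy pattern are all hypothetical, the HashMap merely imitates the global variable lookup, and new Date() stands in for the current-date routine.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.text.SimpleDateFormat;
import java.util.ArrayList;
import java.util.Date;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class FileListToFlowSketch {

    // One row of the flow, matching the two schema columns: Filename and Date.
    static final class Row {
        final String filename;
        final Date date;

        Row(String filename, Date date) {
            this.filename = filename;
            this.date = date;
        }
    }

    // Lists the regular files directly inside dir, without descending
    // into subdirectories, sorted by path.
    static List<Path> listFiles(Path dir) throws IOException {
        try (Stream<Path> entries = Files.list(dir)) {
            return entries.filter(Files::isRegularFile)
                          .sorted()
                          .collect(Collectors.toList());
        }
    }

    // Turns each iteration into one row of the flow.
    static List<Row> toFlow(List<Path> files) {
        // Assumption: a plain map imitating the Job's global variables.
        Map<String, Object> globalMap = new HashMap<>();
        List<Row> rows = new ArrayList<>();
        for (Path file : files) {
            // The file list step publishes the current file path ...
            globalMap.put("tFileList_1_CURRENT_FILEPATH", file.toString());
            // ... and the mapping reads it back for the Filename column.
            String filename = (String) globalMap.get("tFileList_1_CURRENT_FILEPATH");
            Date date = new Date(); // stand-in for the current-date routine
            rows.add(new Row(filename, date));
        }
        return rows;
    }

    // Renders a row using a Java date pattern (hypothetical layout).
    static String format(Row row, String pattern) {
        return row.filename + " | " + new SimpleDateFormat(pattern).format(row.date);
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical sample data: three .txt files in a temporary directory.
        Path dir = Files.createTempDirectory("Countries");
        for (String name : new String[] {"file1.txt", "file2.txt", "file3.txt"}) {
            Files.createFile(dir.resolve(name));
        }
        for (Row row : toFlow(listFiles(dir))) {
            System.out.println(format(row, "dd-MM-yyyy"));
        }
    }
}
```

Running main prints one line per file, each holding the full file path followed by the current date. Changing the pattern string passed to format changes how the date is rendered, which is the same concern as defining the date pattern in the schema.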