Setting up the Job
Procedure
-
Open the Basic settings view of the tSambaConnection component by double-clicking the component and do the following.
- Enter the IP address of the Samba host in the host field;
- If user authentication is enabled on the Samba host, enter the user name and password in the username and password fields; otherwise, leave these two fields empty;
-
Enter the domain name in the Domain field.
If the Samba host is not configured with a domain, leave this field empty.
-
Open the Basic settings view of the tSambaList component by double-clicking the component and do the following.
- Select Use an existing connection and select the tSambaConnection component from the Component List drop-down list;
- Enter the name of the shared folder set in the Samba host in the Share directory field (SmbShare in this example);
- Enter the path to the directory whose files you want to process in the Remote path field (/abc in this example);
-
Click the Guess schema button and then click OK to accept the schema generated.
The schema generated contains seven columns, as shown in the following figure. You can edit the schema based on your actual needs by removing undesired columns.
Note:
- The tSambaList component does not pass data to any columns other than the seven columns.
- The FileName_with_Path column is used by the subsequent components in this scenario. Make sure the column is present in the schema.
- For data that is of Date type, information about hours, minutes, seconds, and so on is hidden in Talend Studio by default. You can have such information displayed by changing the setting in the Date Pattern column (as shown in the Change_Time row in the following figure). You can access a list of all the supported date patterns by clicking the Date Pattern field of a row that is of Date type and pressing Ctrl + Space.
- Select Includes subdirectories if you want to process the CSV files in all the directories under the SmbShare/abc directory;
- Add a row in the File mask field by clicking the plus button at the bottom of the field and enter "*.csv" in the row.
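The effect of the Date Pattern setting described in the note above can be illustrated with a small Java sketch. It assumes that the patterns follow the java.text.SimpleDateFormat syntax and that the default pattern is "dd-MM-yyyy"; neither is stated on this page. The class name DatePatternSketch and the sample timestamp are hypothetical.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

// Hypothetical helper for illustration only; not part of Talend Studio.
class DatePatternSketch {

    // Formats a date with the given pattern, using UTC so the result
    // does not depend on the time zone of the machine running the code.
    static String format(Date value, String pattern) {
        SimpleDateFormat formatter = new SimpleDateFormat(pattern);
        formatter.setTimeZone(TimeZone.getTimeZone("UTC"));
        return formatter.format(value);
    }

    public static void main(String[] args) {
        // Sample value standing in for a Change_Time cell (assumption).
        Date changeTime = new Date(1_000_000_000_000L);

        // Assumed default pattern: the time of day is hidden.
        System.out.println(format(changeTime, "dd-MM-yyyy"));          // 09-09-2001

        // Extended pattern: hours, minutes, and seconds are displayed.
        System.out.println(format(changeTime, "dd-MM-yyyy HH:mm:ss")); // 09-09-2001 01:46:40
    }
}
```

The stored value is the same in both cases; only the pattern decides how much of it is displayed.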
-
Open the Basic settings view of the tLogRow component by double-clicking the component and do the following.
- Click the Sync columns button to synchronize the schema with that of the tSambaList component;
- Select Table (print values in cells of a table);
- Leave the other options as they are.
-
Open the Basic settings view of the tFlowToIterate component by double-clicking the component and do the following.
- Clear the Use the default (key, value) in global variables check box;
-
Add a row in the Customize table by clicking the plus button at the bottom of the table; enter "CURRENT_FILE_PATH" in the key column and select FileName_with_Path from the value column;
This row creates a global variable named CURRENT_FILE_PATH, which holds the path to the current file.
- Leave the other options as they are.
-
Open the Basic settings view of the tSambaDelete component by double-clicking the component and do the following.
- Select Use an existing connection and select the tSambaConnection component from the Component List drop-down list;
- Enter the name of the shared folder set in the Samba host in the Share directory field (SmbShare in this example);
-
Enter an expression that retrieves the path to the file to be removed (((String)globalMap.get("CURRENT_FILE_PATH")) in this example) in the Remote path field;
Note: You can also enter the string by placing the cursor in the Remote path field, pressing Ctrl + Space, and then selecting tFlowToIterate_1.CURRENT_FILE_PATH from the list that appears.
- Select the Remove directory check box if you also want to remove the directory specified in the Remote path field;
- Leave the other options as they are.
- Save the Job.
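The way the CURRENT_FILE_PATH global variable links the tFlowToIterate and tSambaDelete components can be sketched in plain Java. This is a simplified illustration, not the code that Talend Studio generates: it assumes that globalMap behaves like a Map<String, Object>, and the class name GlobalMapSketch, the helper methods, and the sample paths are all hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the Job logic; not Talend's generated code.
class GlobalMapSketch {

    // Mimics the row added in the Customize table: on each iteration, the
    // FileName_with_Path value of the current row is stored under the key.
    static void storeCurrentFile(Map<String, Object> globalMap, String fileNameWithPath) {
        globalMap.put("CURRENT_FILE_PATH", fileNameWithPath);
    }

    // Mimics the expression entered in the Remote path field. The cast is
    // needed because the map holds values of type Object.
    static String remotePath(Map<String, Object> globalMap) {
        return ((String) globalMap.get("CURRENT_FILE_PATH"));
    }

    public static void main(String[] args) {
        Map<String, Object> globalMap = new HashMap<>();

        // Sample values standing in for the rows listed from the share (assumption).
        String[] listedFiles = {"/abc/a.csv", "/abc/sub/b.csv"};

        for (String file : listedFiles) {
            storeCurrentFile(globalMap, file);
            // The delete step would read the value stored for this iteration.
            System.out.println("File to remove: " + remotePath(globalMap));
        }
    }
}
```

Because the value is overwritten on every iteration, the expression in the Remote path field always resolves to the file of the row currently being processed.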