Double-click tHDFSGet to define the
component in its Basic settings
view.
Select, for example, Apache 0.20.2 from the Hadoop
version list.
In the NameNode URI, Username, and
Group fields, enter the connection parameters to
HDFS. If you are using WebHDFS, the location should be
webhdfs://masternode:portnumber; WebHDFS with SSL is not
supported yet.
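As a rough illustration of what such a WebHDFS location resolves to, the sketch below builds the REST URL that the WebHDFS protocol uses to open a file for reading (the `op=OPEN` operation at `/webhdfs/v1/<path>`). The host, port, path, and user name are illustrative placeholders, not values from a real cluster, and the helper function is hypothetical, not part of the component.

```python
# Hypothetical helper: builds the WebHDFS REST URL corresponding to a
# webhdfs://masternode:portnumber NameNode URI. All values are placeholders.
def webhdfs_open_url(host, port, hdfs_path, user):
    # WebHDFS serves file reads over HTTP at /webhdfs/v1/<path>?op=OPEN
    return (f"http://{host}:{port}/webhdfs/v1{hdfs_path}"
            f"?op=OPEN&user.name={user}")

url = webhdfs_open_url("masternode", 50070, "/testFile/in.txt", "hdfsuser")
print(url)
# Plain http only here, consistent with SSL not being supported yet.
```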
In the HDFS directory field, type in
the location storing the file to be extracted in HDFS. In this example, it is
/testFile.
Next to the Local directory field, click
the [...] button to browse to the
folder intended to store the files extracted from HDFS. In
this scenario, the directory is
C:/hadoopfiles/getFile/.
Click the Overwrite file field to expand
the drop-down list.
From the list, select always.
In the Files area, click the plus button
to add a row in which you define the file to be extracted.
In the File mask column, enter
*.txt between the quotation marks, replacing the default
newLine, and leave the New
name column as it is. This allows you to extract all the
.txt files from the specified directory in HDFS
without changing their names. In this example, the file is
in.txt.
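The file-mask behavior described above is ordinary glob matching: every file whose name matches the mask is extracted, and an empty New name column means names are kept. A minimal sketch, using an assumed list of file names (only in.txt exists in the scenario itself):

```python
import fnmatch

# Illustrative file names; in the scenario only in.txt is present in /testFile.
hdfs_files = ["in.txt", "notes.txt", "data.csv"]
file_mask = "*.txt"

# Files matching the mask are extracted; their names are unchanged
# because the New name column is left as it is.
extracted = [f for f in hdfs_files if fnmatch.fnmatch(f, file_mask)]
print(extracted)  # ['in.txt', 'notes.txt']
```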