
Write data to HDFS

This example uses a Standard Job to write data to HDFS.

Procedure

  1. In the Studio, create an empty Standard Job.
  2. In the Job, add a tRowGenerator component to generate random data, or add a component that reads a file saved on your EC2 instance.
  3. Drag the HDFS metadata from the Repository to the design space and select tHDFSOutput.
  4. In the Component view of this component, enter the directory where you want to write your data on HDFS.
  5. Press F6 to run the Job.
  6. To read data from HDFS, drag the HDFS metadata from the Repository to the design space and select tHDFSInput. As with tHDFSOutput, specify the path to the file to be read, and also provide the schema of that file. Once done, right-click the tHDFSInput component and click Data viewer to preview the content of the file.
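
For readers who want to see the same data flow outside the Studio, the following is a minimal sketch, not the code the Studio generates. It assumes the `hdfs` command-line client is installed and configured for your cluster. The schema, the function names, the semicolon separator, and any target path you pass in are hypothetical and only serve the illustration. Rows are generated in the spirit of tRowGenerator, written by piping them to `hdfs dfs -put`, and read back with `hdfs dfs -cat` using an explicit schema, as tHDFSInput requires.

```python
import csv
import io
import random
import subprocess

# Hypothetical schema: column names and types are assumptions for illustration,
# standing in for the schema you would define on tRowGenerator / tHDFSInput.
SCHEMA = [("id", int), ("name", str), ("amount", float)]


def generate_rows(count, seed=0):
    """Produce pseudo-random rows, similar in spirit to tRowGenerator."""
    rng = random.Random(seed)
    names = ["alice", "bob", "carol"]
    return [
        (i, rng.choice(names), round(rng.uniform(0, 100), 2))
        for i in range(1, count + 1)
    ]


def to_delimited(rows, sep=";"):
    """Serialize rows as delimited text, one row per line."""
    buf = io.StringIO()
    csv.writer(buf, delimiter=sep, lineterminator="\n").writerows(rows)
    return buf.getvalue()


def parse_delimited(text, sep=";"):
    """Parse delimited text back into typed rows using SCHEMA."""
    reader = csv.reader(io.StringIO(text), delimiter=sep)
    return [
        tuple(cast(value) for (_, cast), value in zip(SCHEMA, record))
        for record in reader
    ]


def hdfs_put_command(target):
    # "-" makes the hdfs client read from stdin; -f overwrites an existing file.
    return ["hdfs", "dfs", "-put", "-f", "-", target]


def hdfs_cat_command(path):
    return ["hdfs", "dfs", "-cat", path]


def write_to_hdfs(rows, target):
    """Write rows to an HDFS file (requires a configured hdfs client)."""
    data = to_delimited(rows).encode("utf-8")
    subprocess.run(hdfs_put_command(target), input=data, check=True)


def read_from_hdfs(path):
    """Read an HDFS file and apply SCHEMA (requires a configured hdfs client)."""
    result = subprocess.run(hdfs_cat_command(path), capture_output=True, check=True)
    return parse_delimited(result.stdout.decode("utf-8"))
```

Calling `write_to_hdfs(generate_rows(100), target)` with a target path of your choice, followed by `read_from_hdfs(target)`, mirrors the write and read halves of the procedure. The serialization and parsing helpers are kept separate from the HDFS calls so they can be checked locally without a cluster.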
