
Accessing files on a Hadoop cluster from your engine

Before you begin

  • Make sure you use a recent version of docker-compose to avoid volumes not being mounted correctly.
  • Ask your system administrator for the complete set of Hadoop configuration files (core-site.xml, hdfs-site.xml, etc.).
  • Put these Hadoop configuration files in a folder on your local machine and copy its path.
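
The folder check above can be sketched as a small shell helper. This is only an illustration: the function name check_hadoop_conf_dir is hypothetical, and it tests only for the two files named above (core-site.xml and hdfs-site.xml), while your cluster may need more.

```shell
#!/bin/sh
# Hypothetical helper (not part of the product): checks that a folder
# contains the two Hadoop configuration files named in this section.
# Assumption: your administrator may list further files; add them to the loop.
check_hadoop_conf_dir() {
  conf_dir="$1"
  if [ ! -d "$conf_dir" ]; then
    echo "Not a directory: $conf_dir" >&2
    return 1
  fi
  missing=0
  for f in core-site.xml hdfs-site.xml; do
    if [ ! -f "$conf_dir/$f" ]; then
      echo "Missing: $conf_dir/$f" >&2
      missing=1
    fi
  done
  # Returns 0 when both files are present, 1 otherwise.
  return "$missing"
}
```

To use it, call check_hadoop_conf_dir with the path to your folder and check its exit status. Running docker-compose version beforehand shows which docker-compose release is installed.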

Procedure

  1. Go to one of the following folders in the Remote Engine Gen2 installation directory:
    • default if you are using the engine in the AWS USA, AWS Europe, AWS Asia-Pacific, or Azure regions.
    • eap if you are using the engine as part of the Early Adopter Program.

  2. Create a new file and name it:
    docker-compose.override.yml
  3. Edit this file to add the following:
    version: '3.6'
    
    services: 
    
      livy: 
        environment: 
          HADOOP_CONF_DIR: file:/opt/my-hadoop-cluster-config
        volumes: 
          - YOUR_LOCAL_HADOOP_CONFIGURATION_FOLDER:/opt/my-hadoop-cluster-config
       
      component-server: 
        environment: 
          HADOOP_CONF_DIR: file:/opt/my-hadoop-cluster-config
        volumes: 
          - YOUR_LOCAL_HADOOP_CONFIGURATION_FOLDER:/opt/my-hadoop-cluster-config

    where YOUR_LOCAL_HADOOP_CONFIGURATION_FOLDER is the path to the local folder that contains your Hadoop configuration files.

  4. Save the file.
  5. Restart your Remote Engine Gen2.
  6. Connect to Talend Cloud Pipeline Designer.
  7. Go to the Connections page and add a new HDFS connection using your engine and your local user name.
    Adding new HDFS connection.
  8. Add a new HDFS dataset using the new connection and make sure you use the path to your files (for example, hdfs://namenode:8020/user/talend/files).
    Adding a new HDFS dataset.
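
Steps 2 and 3 of the procedure can also be scripted. The sketch below writes the same override content shown in step 3, with the placeholder replaced by a path you pass in. The function name write_override and the example paths are hypothetical. The service names, container path, and HADOOP_CONF_DIR value are copied from step 3.

```shell
#!/bin/sh
# Hypothetical helper (not part of the product): writes
# docker-compose.override.yml with the placeholder from step 3
# replaced by a real local folder path.
write_override() {
  conf_dir="$1"   # path to your local Hadoop configuration folder
  out_file="$2"   # for example default/docker-compose.override.yml
  cat > "$out_file" <<EOF
version: '3.6'

services:

  livy:
    environment:
      HADOOP_CONF_DIR: file:/opt/my-hadoop-cluster-config
    volumes:
      - ${conf_dir}:/opt/my-hadoop-cluster-config

  component-server:
    environment:
      HADOOP_CONF_DIR: file:/opt/my-hadoop-cluster-config
    volumes:
      - ${conf_dir}:/opt/my-hadoop-cluster-config
EOF
}
```

For example, write_override /home/me/hadoop-conf default/docker-compose.override.yml, where /home/me/hadoop-conf is an illustrative path. Assuming the engine folder also holds the main docker-compose.yml file, running docker-compose config from that folder prints the merged configuration, which lets you confirm the volume mounts before restarting the engine.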
