You can create a Hadoop cluster metadata definition so that you can quickly configure components with your Hadoop cluster information. Talend Studio also
allows you to import a cluster metadata definition.
Before you begin
- This tutorial requires a Hadoop cluster. You
must have access to one before you start.
- Select the Integration perspective.
Procedure
-
In the Repository, expand Metadata, right-click
Hadoop Cluster and click
Create Hadoop Cluster.
-
In the Name field,
enter a name.
Example
MyHadoopCluster
- Optional:
In the Purpose
field, enter a purpose.
Example
Cluster connection metadata
- Optional:
In the
Description field, enter a description.
Example
Metadata to connect to an Amazon EMR
cluster
Tip: Enter a Purpose
and Description to stay organized.
-
Click Next.
-
Select a Distribution.
Example
Select
Amazon EMR.
-
Select a Version.
Example
Select
EMR 5.15.0 (Hadoop 2.8.3).
-
Select
Enter manually Hadoop services.
-
Click Finish.
The Hadoop Cluster Connection
window opens.
-
Enter your Connection details.
Example
- Namenode URI:
hdfs://hadoopcluster:8020
- Resource Manager:
hadoopcluster:8032
- Resource Manager Scheduler:
hadoopcluster:8030
- Job History:
hadoopcluster:10020
- Staging directory:
/user
-
Enter your Authentication details.
- Optional:
Click
Check Services.
-
Click Finish.
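The connection details entered in the procedure above can be sanity-checked before you type them into the wizard. The following is a minimal sketch in Python, not part of Talend Studio: it validates the example values and maps them to standard Hadoop property names. The dictionary keys, the `validate` and `split_host_port` helpers, and the field-to-property mapping are assumptions made for illustration. Talend Studio may store these values differently.

```python
from urllib.parse import urlparse

# Example values from the procedure; replace them with your cluster's details.
connection = {
    "namenode_uri": "hdfs://hadoopcluster:8020",
    "resource_manager": "hadoopcluster:8032",
    "resource_manager_scheduler": "hadoopcluster:8030",
    "job_history": "hadoopcluster:10020",
    "staging_directory": "/user",
}

# Assumed mapping from wizard fields to standard Hadoop property names.
# The property names exist in Hadoop; the mapping itself is illustrative.
HADOOP_PROPERTIES = {
    "namenode_uri": "fs.defaultFS",
    "resource_manager": "yarn.resourcemanager.address",
    "resource_manager_scheduler": "yarn.resourcemanager.scheduler.address",
    "job_history": "mapreduce.jobhistory.address",
    "staging_directory": "yarn.app.mapreduce.am.staging-dir",
}


def split_host_port(value):
    """Split 'host:port' into (host, port); raise ValueError if malformed."""
    host, sep, port = value.rpartition(":")
    if not sep or not host or not port.isdigit():
        raise ValueError(f"expected host:port, got {value!r}")
    return host, int(port)


def validate(conn):
    """Check each field's format and return a property-name-to-value dict."""
    uri = urlparse(conn["namenode_uri"])
    if uri.scheme != "hdfs" or not uri.hostname or uri.port is None:
        raise ValueError("Namenode URI must look like hdfs://host:port")
    for key in ("resource_manager", "resource_manager_scheduler", "job_history"):
        split_host_port(conn[key])
    if not conn["staging_directory"].startswith("/"):
        raise ValueError("Staging directory must be an absolute path")
    return {HADOOP_PROPERTIES[key]: value for key, value in conn.items()}
```

Calling `validate(connection)` returns a dictionary keyed by Hadoop property name, or raises `ValueError` for a value such as a host without a port.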
Results
The Hadoop cluster metadata definition appears in the
Repository.
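If you want to confirm outside of Talend Studio that the endpoints in the definition accept connections, a plain TCP probe is a rough approximation. The sketch below is not how Check Services is implemented. It only tests whether each port accepts a connection, and it does not verify authentication or that the expected Hadoop service is listening. The host name `hadoopcluster` is the placeholder from the example values, and `is_reachable` and `check_endpoints` are hypothetical helper names.

```python
import socket


def is_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


# Example endpoints from the procedure; "hadoopcluster" is a placeholder host.
ENDPOINTS = {
    "Namenode": ("hadoopcluster", 8020),
    "Resource Manager": ("hadoopcluster", 8032),
    "Resource Manager Scheduler": ("hadoopcluster", 8030),
    "Job History": ("hadoopcluster", 10020),
}


def check_endpoints(endpoints, timeout=3.0):
    """Probe every endpoint and return a name-to-bool reachability map."""
    return {
        name: is_reachable(host, port, timeout)
        for name, (host, port) in endpoints.items()
    }
```

Running `check_endpoints(ENDPOINTS)` from a machine that can reach the cluster returns a dictionary such as `{"Namenode": True, ...}`. An unreachable entry usually points to a wrong host name, a wrong port, or a firewall rule.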