Start an Amazon EMR cluster

In Talend Studio, create a new Standard Job to launch an Amazon EMR cluster.

Procedure

  1. Create a new Standard Job.
  2. Add a tAmazonEMRManage component and open the Component view.
  3. Provide your Amazon credentials to the Job.
    • If you have set up the context for these credentials in the previous steps, do the following:
      1. In the Context view of your Job, add the AmazonCredentials context that is stored in the Repository.
      2. In the Access Key and Secret key fields, enter the context variables you created previously: context.AccessKey and context.SecretKey, respectively.
    • If the security policy of your organization does not allow you to expose the credentials in a Job, select the Inherit credentials from AWS check box to obtain AWS security credentials from your EMR instance metadata. To use this option, the S3 system you use must be S3A and an IAM role must have been configured to manage temporary credentials for client applications. For more information, see Using an IAM Role to Grant Permissions to Applications Running on Amazon EC2 Instances.
  4. From the Action list, select Start. This list also provides the Stop action, which you use to stop the cluster.
  5. In the Region list, select the region to be used.
  6. Enter a name for your cluster.
  7. Select an EMR distribution with All Applications. This lets you work with Core Hadoop Services as well as Spark.
  8. Select the Use EC2 key pair check box and provide your EC2 key pair name.
  9. In the Instance configuration area, specify the number of nodes you want. At runtime, one instance is designated as the master and the others are designated as slaves. You can also specify the instance type for the master node and for the slave nodes.
  10. Press F6 to run the Job.
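
For readers who want to compare the component settings with a scripted launch, the following is a minimal sketch of roughly equivalent parameters expressed as a request for the AWS SDK for Python (boto3). It is not how tAmazonEMRManage is implemented. The function names, the sample values, and the choice of the default EMR roles (EMR_EC2_DefaultRole and EMR_DefaultRole) are assumptions made for illustration, and the application list names only Hadoop and Spark rather than every application in the distribution.

```python
from typing import Any, Dict


def build_cluster_request(
    name: str,
    release_label: str,
    key_pair: str,
    node_count: int,
    master_type: str,
    worker_type: str,
) -> Dict[str, Any]:
    """Assemble keyword arguments for the boto3 EMR run_job_flow call.

    Hypothetical helper: it mirrors the cluster name, distribution,
    EC2 key pair, and instance configuration set in the procedure.
    """
    if node_count < 1:
        raise ValueError("a cluster needs at least one node")
    return {
        "Name": name,
        "ReleaseLabel": release_label,
        # Assumption: only Hadoop and Spark are requested here.
        "Applications": [{"Name": "Hadoop"}, {"Name": "Spark"}],
        "Instances": {
            # One instance becomes the master, the rest are workers.
            "InstanceCount": node_count,
            "MasterInstanceType": master_type,
            "SlaveInstanceType": worker_type,
            "Ec2KeyName": key_pair,
            # Keep the cluster alive after launch instead of
            # terminating it when there are no steps to run.
            "KeepJobFlowAliveWhenNoSteps": True,
        },
        # Assumption: the default EMR roles exist in the account.
        "JobFlowRole": "EMR_EC2_DefaultRole",
        "ServiceRole": "EMR_DefaultRole",
    }


def start_cluster(region: str, request: Dict[str, Any]) -> str:
    """Send the request and return the new cluster ID."""
    import boto3  # third-party; imported lazily so the builder works without it

    emr = boto3.client("emr", region_name=region)
    return emr.run_job_flow(**request)["JobFlowId"]
```

A call such as start_cluster("us-east-1", build_cluster_request("demo-cluster", "emr-5.15.0", "my-key-pair", 3, "m4.large", "m4.large")) would then launch a three-node cluster. All of these values are placeholders. With no explicit keys passed, boto3 resolves credentials from the environment, which is comparable in spirit to the Inherit credentials from AWS option.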

Results

A new cluster is launched. You can verify it from the Amazon EMR home page.

You can also check the status from the EC2 instances list.

In Talend Studio, the console in the Run view shows the following message:

Your cluster is now ready.
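
The same check can be scripted. The sketch below, which assumes boto3 is installed and that you know the cluster ID, reads the cluster state through the EMR describe_cluster call and maps it to a coarse label. The function names and the three labels are hypothetical. The state names are the ones Amazon EMR reports for a cluster.

```python
# Cluster states reported by Amazon EMR, grouped into coarse labels.
READY_STATES = {"RUNNING", "WAITING"}
STOPPED_STATES = {"TERMINATING", "TERMINATED", "TERMINATED_WITH_ERRORS"}


def classify_state(state: str) -> str:
    """Map an EMR cluster state to 'ready', 'stopped', or 'pending'."""
    if state in READY_STATES:
        return "ready"
    if state in STOPPED_STATES:
        return "stopped"
    # STARTING, BOOTSTRAPPING, and anything unrecognized count as pending.
    return "pending"


def cluster_state(region: str, cluster_id: str) -> str:
    """Fetch the current state of a cluster from Amazon EMR."""
    import boto3  # third-party; imported lazily so classify_state works without it

    emr = boto3.client("emr", region_name=region)
    return emr.describe_cluster(ClusterId=cluster_id)["Cluster"]["Status"]["State"]
```

For example, classify_state(cluster_state("us-east-1", cluster_id)) yields "ready" once the cluster can accept work, where the region is a placeholder and cluster_id is the ID of the cluster you launched.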
