tAmazonEMRManage Standard properties
These properties are used to configure tAmazonEMRManage running in the Standard Job framework.
The Standard tAmazonEMRManage component belongs to the Cloud family.
The component in this framework is available in all Talend products.
Basic settings
Access key and Secret key |
Specify the access keys (the access key ID in the Access Key field and the secret access key in the Secret Key field) required to access Amazon Web Services. For more information on AWS access keys, see Access keys (access key ID and secret access key). To enter the secret key, click the [...] button next to the secret key field, and then in the pop-up dialog box enter the secret key between double quotes and click OK to save the settings. |
Inherit credentials from AWS role |
Select this check box to leverage the instance profile credentials. The credentials can be used on Amazon EC2 instances or AWS ECS, and are delivered through the Amazon EC2 metadata service. To use this option, your Job must be running within Amazon EC2 or other services that can leverage IAM Roles for access to resources. For more information, see Using an IAM Role to Grant Permissions to Applications Running on Amazon EC2 Instances. |
Assume role |
If you temporarily need access permissions associated with an AWS IAM role that is not granted to your user account, select this check box to assume that role. Then specify the values for the following parameters to create a new assumed role session. |
Action |
Select an action to be performed from the list, either Start or Stop. |
Region |
Specify the AWS region by selecting a region name from the list or entering a region between double quotation marks (for example "us-east-1"). For more information about how to specify the AWS region, see Choose an AWS Region. |
Cluster name |
Enter the name of the cluster. |
Cluster version |
Select the version of the cluster. You can also select the Customize Version and Application check box on the Advanced settings view to customize the cluster version information. This property is not available when the Customize Version and Application check box is selected. |
Application |
Select the applications to be installed on the cluster. You can also select the Customize Version and Application check box on the Advanced settings view to customize the applications information. This property is available when an EMR version is selected from the Cluster version list and the Customize Version and Application check box is cleared. |
Service role |
Enter the IAM (Identity and Access Management) role for the Amazon EMR service. The default role is EMR_DefaultRole. To use this default role, you must have already created it. |
Job flow role |
Enter the IAM role for the EC2 instances that Amazon EMR manages. The default role is EMR_EC2_DefaultRole. To use this default role, you must have already created it. |
Enable log |
Select this check box to enable logging and in the field displayed specify the path to a folder in an S3 bucket where you want Amazon EMR to write the log data. |
Use EC2 key pair |
Select this check box to associate an Amazon EC2 (Elastic Compute Cloud) key pair with the cluster and in the field displayed enter the name of your EC2 key pair. |
Predicate |
Specify the cluster(s) that you want to stop. This list is available only when Stop is selected from the Action list. |
Instance count |
Enter the number of Amazon EC2 instances to initialize. This field is available only if you select Start from the Action drop-down list in the Basic settings view and if the Use multiple master nodes check box is cleared. |
Slave instance count |
Enter the number of Amazon EC2 slave instances to initialize. This field is available only if you select Start from the Action drop-down list in the Basic settings view and if you select the Use multiple master nodes check box in the Advanced settings view. |
Master instance type |
Select the type of the master instance to initialize. |
Slave instance type |
Select the type of the slave instance to initialize. |
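As a rough illustration of how the Basic settings relate to one another, the following Python sketch collects them into a dictionary whose field names follow the Amazon EMR RunJobFlow API (Name, ReleaseLabel, Applications, ServiceRole, JobFlowRole, LogUri, Instances). This is an assumption made for illustration only: how the component actually builds its request is not described here, and the function name, the settings keys, and all example values (cluster name, release label, instance types, bucket path) are hypothetical placeholders.

```python
import json


def build_run_job_flow_request(settings):
    """Collect hypothetical Basic settings values into RunJobFlow-style fields."""
    instances = {
        "InstanceCount": settings["instance_count"],
        "MasterInstanceType": settings["master_instance_type"],
        "SlaveInstanceType": settings["slave_instance_type"],
    }
    # "Use EC2 key pair": only set the key name when one was provided.
    if settings.get("ec2_key_pair"):
        instances["Ec2KeyName"] = settings["ec2_key_pair"]

    request = {
        "Name": settings["cluster_name"],
        "ReleaseLabel": settings["cluster_version"],
        "Applications": [{"Name": app} for app in settings["applications"]],
        # Default role names as stated in the Service role and Job flow role rows.
        "ServiceRole": settings.get("service_role", "EMR_DefaultRole"),
        "JobFlowRole": settings.get("job_flow_role", "EMR_EC2_DefaultRole"),
        "Instances": instances,
    }
    # "Enable log": only set the S3 log folder when logging is enabled.
    if settings.get("log_uri"):
        request["LogUri"] = settings["log_uri"]
    return request


# Placeholder values, not taken from any real environment.
example_settings = {
    "cluster_name": "demo-cluster",
    "cluster_version": "emr-5.23.0",
    "applications": ["Hadoop", "Spark"],
    "instance_count": 3,
    "master_instance_type": "m5.xlarge",
    "slave_instance_type": "m5.xlarge",
    "log_uri": "s3://my-bucket/emr-logs/",
}

request = build_run_job_flow_request(example_settings)
print(json.dumps(request, indent=2))
```

Optional settings (the key pair and the log folder) are left out of the dictionary when they are not supplied, which mirrors the way the corresponding fields appear only after their check boxes are selected.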
Advanced settings
STS Endpoint |
Select this check box to specify the AWS Security Token Service (STS) endpoint from which to retrieve the session credentials. For example, enter sts.amazonaws.com. This check box is available only when the Assume Role check box is selected. |
Signing region |
Select the AWS region of the STS service. If the region is not in the list, you can enter its name between double quotation marks. The default value is us-east-1. This drop-down list is available only when the Assume Role check box is selected. |
External Id |
If the administrator of the account to which the role belongs provided you with an external ID, enter its value here. The External Id is a unique identifier that allows a limited set of users to assume the role. This field is available only when the Assume Role check box is selected. |
Serial number |
When you assume a role, the trust policy of this role might require Multi-Factor Authentication (MFA). In this case, you must indicate the identification number of the hardware or virtual MFA device that is associated with the user who assumes the role. This field is available only when the Assume Role check box is selected. |
Token code |
When you assume a role, the trust policy of this role might require Multi-Factor Authentication (MFA). In this case, you must indicate a token code. This token code is a time-based one-time password produced by the MFA device. This field is available only when the Assume Role check box is selected. |
Tags |
List session tags in the form of key-value pairs. You can then use these session tags in policies to allow or deny access to requests. Transitive: select this check box to indicate that a tag will persist to the next role in a role chain. For more information about tags, see Passing Session Tags in AWS STS. This field is available only when the Assume Role check box is selected. |
IAM Policy ARNs |
Enter the Amazon Resource Names (ARNs) of the IAM managed policies that you want to use as managed session policies. Use managed session policies to limit the permissions of the session. The policies must exist in the same account as the role. The resulting session's permissions are the intersection of the role's identity-based policy and the session policies. For more information about session policies, see the corresponding section in Policies and Permissions. This field is available only when the Assume Role check box is selected. |
Policy |
Enter an IAM policy in JSON format that you want to use as a session policy. Use session policies to limit the permissions of the session. The resulting session's permissions are the intersection of the role's identity-based policy and the session policies. For more information about session policies, see the corresponding section in Policies and Permissions. This field is available only when the Assume Role check box is selected. |
Wait for cluster ready |
Select this check box to let your Job wait until the launch of the cluster is completed. |
Visible to all users |
Select this check box to make the cluster visible to all IAM users. |
Termination Protect |
Select this check box to enable termination protection to prevent instances in the cluster from shutting down due to errors or issues during processing. |
Enable debug |
Select this check box to enable the debug mode. |
Customize Version and Application |
Select this check box to customize the version of the cluster and the applications to be installed on the cluster. |
Use multiple master nodes |
Select this check box to enable high availability and launch a cluster with multiple master nodes with the Amazon EMR distribution in version 5.23 or later. Important: If you select this check box, you must specify the identifier of the Amazon VPC subnet in the Subnet id field. |
Subnet id |
Specify the identifier of the Amazon VPC (Virtual Private Cloud) subnet where you want the Job flow to launch. |
Availability Zone |
Specify the availability zone for your cluster's EC2 instances. |
Master security group |
Specify the security group for the master instance. |
Additional master security groups |
Specify additional security groups for the master instance and separate them with a comma, for example, gname1, gname2, gname3. |
Slave security group |
Specify the security group for the slave instances. |
Additional slave security groups |
Specify additional security groups for the slave instances and separate them with a comma, for example, gname1, gname2, gname3. |
Service Access Security Group |
Specify the identifier of the Amazon EC2 security group that allows the Amazon EMR service to access clusters in a VPC private subnet. For information about how to create a private subnet to enable the service access security group on Amazon EMR, see Scenario 2: VPC with Public and Private Subnets (NAT). |
Actions |
Specify the bootstrap actions associated with the cluster, by clicking the [+] button below the table to add as many rows as needed, each row for a bootstrap action, and setting the following parameters for each action:
For more information about the bootstrap actions, see BootstrapActionConfig. |
Steps |
Specify the Job flow step(s) to be invoked on the cluster after its launch, by clicking the [+] button below the table to add as many rows as needed, each row for a step, and setting the following parameters for each step:
For more information about the Job flow steps, see StepConfig. |
Keep alive after steps complete |
Select this check box to keep the Job flow alive after all steps are completed. |
Wait for steps to complete |
Select this check box to let your Job wait until the Job flow steps are completed. This check box is available only when the Wait for cluster ready check box is selected. |
Properties |
Specify the classification and property information supplied to the configuration object of the EMR cluster to be created, by clicking the [+] button below the table to add as many rows as needed, each row for a property, and setting the following parameters:
This field is available only if you select Start from the Action drop-down list in the Basic settings view and if the Use multiple master nodes check box is cleared. |
Properties in JSON |
Enter, in JSON format, the classification and property information supplied to the configuration object of the EMR clusters to be created. This field is available only if you select Start from the Action drop-down list in the Basic settings view and if you select the Use multiple master nodes check box in the Advanced settings view. |
tStatCatcher Statistics |
Select this check box to gather the Job processing metadata at the Job level as well as at each component level. |
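Two of the Advanced settings, Policy and Properties in JSON, take JSON text. The following Python sketch shows one way to draft and sanity-check such values before pasting them into the component. The session policy uses the standard IAM policy grammar (Version, Statement, Effect, Action, Resource). The shape used for Properties in JSON (a list of objects with Classification and Properties keys) is an assumption borrowed from the Amazon EMR configuration object and should be checked against the component itself. The helper name check_properties and all example values are hypothetical.

```python
import json

# Hypothetical session policy for the Policy field (standard IAM policy grammar).
session_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "elasticmapreduce:RunJobFlow",
                "elasticmapreduce:DescribeCluster",
            ],
            "Resource": "*",
        }
    ],
}

# Hypothetical value for Properties in JSON.
# ASSUMPTION: the field follows the EMR configuration shape shown here.
cluster_properties = [
    {
        "Classification": "spark-defaults",
        "Properties": {"spark.executor.memory": "2g"},
    }
]


def check_properties(text):
    """Parse the text and check that each entry has a classification and properties."""
    entries = json.loads(text)  # raises ValueError on malformed JSON
    for entry in entries:
        if not isinstance(entry.get("Classification"), str):
            raise ValueError("each entry needs a string Classification")
        if not isinstance(entry.get("Properties"), dict):
            raise ValueError("each entry needs a Properties object")
    return entries


# Serialize both values to the JSON text that would be entered in the fields.
policy_field = json.dumps(session_policy, indent=2)
properties_field = json.dumps(cluster_properties, indent=2)
check_properties(properties_field)

print(policy_field)
print(properties_field)
```

Because the effective permissions of the session are the intersection of the role's identity-based policy and the session policy, a session policy such as the one above can only narrow what the assumed role already allows; it cannot grant additional actions.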
Global Variables
CLUSTER_FINAL_ID |
The ID of the cluster. This is an After variable and it returns a string. |
CLUSTER_FINAL_NAME |
The name of the cluster. This is an After variable and it returns a string. |
ERROR_MESSAGE |
The error message generated by the component when an error occurs. This is an After variable and it returns a string. |
Usage
Usage rule |
tAmazonEMRManage is usually used as a standalone component. |