
Amazon S3 properties

Properties to configure to establish a connection to S3 in your AWS account.

Amazon S3 connection

Select Amazon S3 in the list and configure the connection.

Configuration

Select your engine from the list and set the main and advanced settings.

Main settings
Property Configuration
Specify credentials Toggle this option OFF if your authentication type does not require credentials. Authentication with credentials is enabled by default.
AWS access key Enter the Access Key ID associated with your AWS account. To find out how to get your Access Key ID and Secret Access Key, read Getting Your AWS Access Keys.
AWS secret key Enter the Secret Access Key paired with your Access Key ID. To find out how to get your Access Key ID and Secret Access Key, read Getting Your AWS Access Keys.
Use STS Toggle this option ON to enable the AWS Security Token Service and create a new assumed role session:
  • In the ARN Role field, enter the Amazon Resource Name (ARN) of the role to assume.
  • In the Role session name field, enter an identifier for the assumed role session.
  • In the Signing region list, select the AWS region of the STS service.
  • (Optional) Toggle Specify STS endpoint ON to specify the AWS Security Token Service endpoint from which the session credentials are retrieved.
  • (Optional) Toggle Specify external ID ON to enter the external ID of a third party that will access your AWS resources. For more information, see the Amazon documentation.
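
The STS fields above map onto the parameters of an AWS STS AssumeRole request. The following is a minimal sketch, not the connector's implementation: the helper name `build_assume_role_params` is hypothetical, and the length and character constraints are taken from the AssumeRole API reference as understood here, so verify them against the Amazon documentation.

```python
import re

# Constraints assumed from the STS AssumeRole API reference:
# session name 2-64 chars, external ID 2-1224 chars, limited character sets.
_SESSION_NAME_RE = re.compile(r"[\w+=,.@-]{2,64}", re.ASCII)
_EXTERNAL_ID_RE = re.compile(r"[\w+=,.@:/-]{2,1224}", re.ASCII)


def build_assume_role_params(role_arn, session_name, external_id=None):
    """Build the keyword arguments for an AssumeRole call (hypothetical helper)."""
    if not role_arn.startswith("arn:"):
        raise ValueError("ARN Role must be a full Amazon Resource Name")
    if not _SESSION_NAME_RE.fullmatch(session_name):
        raise ValueError("invalid role session name")

    params = {"RoleArn": role_arn, "RoleSessionName": session_name}

    # The external ID is optional, like the Specify external ID toggle.
    if external_id is not None:
        if not _EXTERNAL_ID_RE.fullmatch(external_id):
            raise ValueError("invalid external ID")
        params["ExternalId"] = external_id
    return params
```

If the boto3 library is available, a dictionary built this way could be passed as `boto3.client("sts", region_name=...).assume_role(**params)`, where the region corresponds to the Signing region setting and an optional `endpoint_url` argument corresponds to the STS endpoint.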

To pass the connection check, make sure that at least one of the following two permissions is granted on S3: s3:ListAllMyBuckets or s3:GetBucketLogging.
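
As an illustration of granting one of these permissions, the sketch below builds a minimal IAM policy document. The function name `connection_check_policy` and the bucket name used in the usage note are hypothetical, and the policy is only an example under these assumptions; adapt it to your own account's security requirements.

```python
import json


def connection_check_policy(bucket_name=None):
    """Return a minimal IAM policy granting one connection-check permission.

    Without a bucket name, grant s3:ListAllMyBuckets (an account-level action).
    With a bucket name, grant s3:GetBucketLogging on that bucket only.
    """
    if bucket_name is None:
        statement = {
            "Effect": "Allow",
            "Action": "s3:ListAllMyBuckets",
            "Resource": "*",
        }
    else:
        statement = {
            "Effect": "Allow",
            "Action": "s3:GetBucketLogging",
            "Resource": f"arn:aws:s3:::{bucket_name}",
        }
    return {"Version": "2012-10-17", "Statement": [statement]}


def policy_as_json(bucket_name=None):
    """Serialize the policy so it can be pasted into the IAM console."""
    return json.dumps(connection_check_policy(bucket_name), indent=2)
```

For example, `policy_as_json("my-example-bucket")` produces a JSON policy scoped to a single (hypothetical) bucket, while `policy_as_json()` produces the account-level variant.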

After configuring the connection, give it a display name (mandatory) and a description (optional).

Amazon S3 dataset

Dataset configuration
Property Configuration
Dataset name Enter a display name for the dataset. This name will be used as a unique identifier of the dataset in all Talend Cloud apps.
Connection Select your connection in the list. If you are creating a dataset based on an existing connection, this field is read-only.
S3 data settings
Property Configuration
AWS bucket name Select or enter the name of your Amazon S3 bucket.
Object name Select or enter the path of the object (file) to be retrieved.
Encrypt data at rest Enable this option to encrypt the data at rest, then enter your KMS master key.
Format configuration
Property Configuration
Auto detect Click this button to automatically detect the format of the data to be retrieved.
Format Alternatively, select in the list the format of the file to be retrieved and enter or select the information related to this file format:
  • CSV:
    • Record delimiter: Select the type of record separator used in the file to be retrieved. If you select Other, you will be able to enter a custom record delimiter in the Custom record delimiter field.
    • Field delimiter: Select the type of field separator used in the file to be retrieved. If you select Other, you will be able to enter a custom field delimiter in the Custom field delimiter field.
    • Text enclosure character: Enter the character used to enclose the fields.
    • Escape character: Enter the character used to escape special characters in the records to be retrieved.
    • Encoding: Select the type of encoding used in the file to be retrieved. If you select Other, you will be able to enter a custom encoding type in the Custom encoding field.
    • Set header: Enable this option if the file to be retrieved contains header lines and enter or select the number of lines to be skipped in the schema.
  • Excel:
    • Excel format: Select the format/version corresponding to the file to be retrieved.
    • Sheet: Enter the name of the specific Excel sheet you want to be retrieved.
    • Set header/footer: Enable these options if the file to be retrieved contains header and/or footer lines and enter or select the number of lines to be skipped in the schema.
  • Avro: No specific parameters required for this format.
  • Parquet: No specific parameters required for this format.
  • JSON: No specific parameters required for this format.
Note: If you have a CSV dataset on S3 and want to use it with a Type Converter processor in your pipeline, you must first define a quote (") in the Text enclosure character field when creating the S3 dataset; otherwise, an error is thrown.
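
To see how the CSV settings interact, the following sketch uses Python's standard csv module, not the Talend parser, so the behavior is only an approximation. The function name `read_csv_records` and the sample data are hypothetical; the parameters mirror the Field delimiter, Text enclosure character, Escape character, and Set header options.

```python
import csv
import io


def read_csv_records(text, field_delimiter=";", enclosure='"',
                     escape="\\", header_lines=1):
    """Parse CSV text and return the data rows, skipping the header lines."""
    reader = csv.reader(
        io.StringIO(text, newline=""),
        delimiter=field_delimiter,  # Field delimiter
        quotechar=enclosure,        # Text enclosure character
        escapechar=escape,          # Escape character
    )
    rows = list(reader)
    # Set header: drop the configured number of leading lines.
    return rows[header_lines:]


sample = 'id;name\n1;"Doe; Jane"\n2;"Smith"\n'
records = read_csv_records(sample)
```

Because the enclosure character is set to a quote, the semicolon inside `"Doe; Jane"` is kept as part of the field instead of being treated as a field delimiter.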