tAzureSynapseBulkExec Standard properties
These properties are used to configure tAzureSynapseBulkExec running in the Standard Job framework.
The Standard tAzureSynapseBulkExec component belongs to two families: Cloud and Databases.
The component in this framework is available in all Talend products.
Basic settings
Property Type |
Select the way the connection details will be set.
|
Use an existing connection |
Select this check box and in the Component List drop-down list, select the desired connection component to reuse the connection details you already defined. When a Job contains the parent Job and the child Job, if you need to share an existing connection between the two levels, for example, to share the connection created by the parent Job with the child Job, you have to:
For an example about how to share a database connection across Job levels, see Talend Studio User Guide. |
JDBC Provider |
Select the provider of the JDBC driver to be used. |
Host |
Enter the IP address or the hostname of the database server or the Azure Synapse Analytics to be used. If the SQL Server Browser service is running on the machine where the server resides, you can connect to a named instance through a TCP dynamic port by providing the host name and the instance name in this field in the format of {host_name}\{instance_name}. In this case, you can leave the Port field empty. See SQL Server Browser service for related information. |
Port |
Enter the listening port number of the database server or the Azure Synapse Analytics to be used. If the SQL Server Browser service is running on the machine where the server resides, you can connect to a named instance through a TCP dynamic port by providing the host name and the instance name in the Host field and leave this field empty. See SQL Server Browser service for related information. |
Schema |
Enter the name of the Azure Synapse Analytics schema. |
Database |
Specify the name of the Azure Synapse Analytics to be used. |
Username and Password |
Enter the authentication data. To enter the password, click the [...] button next to the Password field, enter the password in double quotes in the pop-up dialog box, and then click OK. You can use Type 2 integrated authentication on Windows by adding integratedSecurity=true in the Additional JDBC Parameters field and leave these two fields empty. See section Connecting with integrated authentication On Windows at Building the connection URL for related information. |
Additional JDBC Parameters |
Specify additional connection properties for the database connection you are creating. The properties are separated by semicolon and each property is a key-value pair. For example, encrypt=true;trustServerCertificate=false; hostNameInCertificate=*.database.windows.net;loginTimeout=30; for Azure SQL database connection. |
Table |
Specify the name of the SQL Data Warehouse table into which data will be loaded. |
Action on table |
Select an operation to be performed on the table defined.
|
Schema and Edit schema |
A schema is a row description. It defines the number of fields (columns) to be processed and passed on to the next component. When you create a Spark Job, avoid the reserved word line when naming the fields.
Click Edit
schema to make changes to the schema.
Information noteNote: If you
make changes, the schema automatically becomes built-in.
|
Load method |
Set the way for loading data. Two options are provided: Copy and PloyBase. See Data loading strategies for Synapse SQL pool for related information. Information noteNote: This option is available
only if you have installed the R2020-05 Studio Monthly update or a later
one delivered by Talend. For more information, check with your
administrator.
|
Azure Storage |
Select the type of the Azure Storage from which data will be loaded. You can select Blob Storage, Data Lake Storage Gen1, or Data Lake Storage Gen2. Note that the Data Lake Storage Gen1 option is not available if Copy is selected from the Load method drop-down list. Information noteNote: The Data Lake Storage
Gen2 option is available only if you have installed the
R2020-05 Studio Monthly update or a later one delivered by Talend. For
more information, check with your administrator.
|
Authentication method |
Select one of the
following authentication method from the drop-down list.
This option is available when Copy is selected from the Load method drop-down list. Information noteNote: This option is available
only if you have installed the R2020-05 Studio Monthly update or a later
one delivered by Talend. For more information, check with your
administrator.
|
Account Name |
Enter the account name for your Azure Blob Storage or Azure Data Lake Storage to be accessed. |
Access key |
Enter the key associated with the storage account you need to access. Two keys are available for each account and by default, either of them can be used for this access. Select the component whose connection details will be used to set up the connection to Azure storage from the drop-down list. This property is available when Blob Storage is selected from the Azure Storage drop-down list. |
SAS token |
Enter the SAS token value. For related information, see Constructing the Account SAS URI. This option is only available to the Shared access signatures option of the Authentication method drop-down list. Information noteNote: This option is available
only if you have installed the R2020-05 Studio Monthly update or a later
one delivered by Talend. For more information, check with your
administrator.
|
Endpoint suffix |
Enter the Azure Storage service endpoint. The combination of the account name and the Azure Storage service endpoint forms the endpoint of the storage account. The default value varies with the Azure Storage setting, that is, "dfs.core.windows.net" for Data Lake Storage Gen2 and "blob.core.windows.net" for Blob Storage. This field is only available to the Copy option of the Load method drop-down list. Information noteNote: This option is available
only if you have installed the R2020-05 Studio Monthly update or a later
one delivered by Talend. For more information, check with your
administrator.
|
Container |
Enter the name of the blob container. This property is available when Blob Storage is selected from the Azure Storage drop-down list. |
External paths |
Enter external paths to load data from. You can add multiple external paths by clicking the plus button on the bottom. Note that the external paths are relative paths with the container as the root. This field is only available when Copy is selected from the Load method drop-down list in the Basic settings view. Information noteNote: This option is available
only if you have installed the R2020-05 Studio Monthly update or a later
one delivered by Talend. For more information, check with your
administrator.
|
Client Id |
Enter your application ID (also called client ID). This field is available when Data Lake Storage Gen1 is selected from the Azure Storage drop-down list. |
OAuth 2.0 token endpoint |
In the Token endpoint field, copy-paste the OAuth 2.0 token endpoint that you can obtain from the Endpoints list accessible on the App registrations page on your Azure portal. This field is available when Data Lake Storage Gen1 is selected from the Azure Storage drop-down list. |
Azure Storage Location |
Specify the location where your Azure Blob Storage or Azure Data Lake Storage account is created. |
Advanced settings
Secure transfer required |
Select this option to use secure transfer for Azure storage connection. This option is available when Data Lake Storage Gen2 is selected from the Azure Storage drop-down list and PolyBase is selected from the Load method drop-down list in the Basic settings view. Information noteNote: This option is available only if
you have installed the R2020-05 Studio Monthly update or a later one
delivered by Talend. For more information, check with your
administrator.
|
Authenticate using Azure Active Directory |
Select this option to use Azure Active Directory authentication when establishing the connection. See Azure AD Authentication for related information. Information noteNote: This option is available only if
you have installed the R2020-05 Studio Monthly update or a later one
delivered by Talend. For more information, check with your
administrator.
|
File format |
Select the file format that defines external data stored in your Azure Blob Storage or Azure Data Lake Storage, Delimited Text, Hive RCFile, Hive ORC, or Parquet. This option is available when PolyBase is selected from the Load method drop-down list in the Basic settings view. For more information about the file format, see CREATE EXTERNAL FILE FORMAT. |
File type |
Select the file type that defines external data stored in your Azure Blob Storage or Azure Data Lake Storage, CSV, ORC, or Parquet. This field is only available when Copy is selected from the Load method drop-down list in the Basic settings view. For more information about the file format, see CREATE EXTERNAL FILE FORMAT. Information noteNote: This option is available only if
you have installed the R2020-05 Studio Monthly update or a later one
delivered by Talend. For more information, check with your
administrator.
|
Specify map to source table fields |
Select this option to map the fields of the source file to specific schema columns by entering schema columns and field indexes in the table that appears. Note that the field indexes of the source file need to be given in an ascendant order in the table. If this option is not selected, the fields of the source file will be mapped to the schema columns in the default order. This field is only available when Copy is selected from the Load method drop-down list in the Basic settings view. Information noteNote: This option is available only if
you have installed the R2020-05 Studio Monthly update or a later one
delivered by Talend. For more information, check with your
administrator.
|
First row |
Set the number of rows in the source file to be treated as the header. The default is 1. This field is only available when Copy is selected from the Load method drop-down list in the Basic settings view and CSV option is selected from the File type drop-down list. Information noteNote: This option is available only if
you have installed the R2020-05 Studio Monthly update or a later one
delivered by Talend. For more information, check with your
administrator.
|
Field quote |
Specify the character that will be used as the quote character (string delimiter) in the source file. This field is only available when Copy is selected from the Load method drop-down list in the Basic settings view and CSV option is selected from the File type drop-down list. Information noteNote: This option is available only if
you have installed the R2020-05 Studio Monthly update or a later one
delivered by Talend. For more information, check with your
administrator.
|
Field terminator |
Specify the field terminator, which indicates the end of each field in the source file. This field is only available when Copy is selected from the Load method drop-down list in the Basic settings view and CSV option is selected from the File type drop-down list. Information noteNote: This option is available only if
you have installed the R2020-05 Studio Monthly update or a later one
delivered by Talend. For more information, check with your
administrator.
|
Row terminator |
Specify the row terminator, which indicates the end of each row in the source file. This field is only available when Copy is selected from the Load method drop-down list in the Basic settings view and CSV option is selected from the File type drop-down list. Information noteNote: This option is available only if
you have installed the R2020-05 Studio Monthly update or a later one
delivered by Talend. For more information, check with your
administrator.
|
Date format |
Specify the date format. The default is Session date format. This field is only available when Copy is selected from the Load method drop-down list in the Basic settings view and CSV option is selected from the File type drop-down list. Information noteNote: This option is available only if
you have installed the R2020-05 Studio Monthly update or a later one
delivered by Talend. For more information, check with your
administrator.
|
Encoding |
Select the encoding from the list, either UTF8 or UTF16. This field is only available when Copy is selected from the Load method drop-down list in the Basic settings view and CSV option is selected from the File type drop-down list. Information noteNote: This option is available only if
you have installed the R2020-05 Studio Monthly update or a later one
delivered by Talend. For more information, check with your
administrator.
|
Identity insert |
Select this option to insert the identity values in the source file to the identity column. For related information, see COPY (Transact-SQL). This field is only available when Copy is selected from the Load method drop-down list in the Basic settings view. Information noteNote: This option is available only if
you have installed the R2020-05 Studio Monthly update or a later one
delivered by Talend. For more information, check with your
administrator.
|
Max errors |
Enter the maximum number of errors allowed that will not stop the process. This field is only available when Copy is selected from the Load method drop-down list in the Basic settings view. |
Field separator |
Specify the character(s) that indicate the end of each field in the delimited text file. This property is available when PolyBase is selected from the Load method drop-down list in the Basic settings view and Delimited Text is selected from the File format drop-down list. |
Enclosed by |
Select this check box and in the field next to it, specify the character that encloses the string in the delimited file. This property is available when PolyBase is selected from the Load method drop-down list in the Basic settings view and Delimited Text is selected from the File format drop-down list. |
Date format |
Select this check box and in the field next to it, specify the custom format for all date and time data in the delimited file. For more information about the date format, see CREATE EXTERNAL FILE FORMAT. This property is available when PolyBase is selected from the Load method drop-down list in the Basic settings view and Delimited Text is selected from the File format drop-down list. |
Use type default |
Select this check box to store each missing value using the default value of the data type of the corresponding column. Clear this check box to store each missing value in the delimited file as NULL. This property is available when PolyBase is selected from the Load method drop-down list in the Basic settings view and Delimited Text is selected from the File format drop-down list. |
Serde Method |
Select a Hive serializer and deserializer method. This property is available when Hive RCFile is selected from the File format drop-down list. |
Compressed by |
Select this check box if external data is compressed, and from the drop-down list displayed next to it, select the compression method. |
Data import reject options |
Select this check box to specify the following reject options.
This option is only available to the PolyBase option of the Load method drop-down list. For more information about the reject options, see CREATE EXTERNAL TABLE. |
Distribution Option |
Select the sharding pattern used to distribute data in the table, Round Robin, Hash, or Replicate. For more information about the sharding pattern supported by Azure Synapse Analytics, see Azure Synapse Analytics - Massively parallel processing (MPP) architecture. This property is available when any option related to table creation is selected from the Action on table drop-down list. |
Distribution Column Name |
The name of the distribution column for a hash-distribution table. This property is available when Hash is selected from the Distribution Option drop-down list. |
Table Option |
Select the index type of the table, Clustered Columnstore Index, Heap, or Clustered Index. For more information, see Indexing tables in SQL Data Warehouse. This property is available when any option related to table creation is selected from the Action on table drop-down list. |
Index column(s) |
Specify the name of one or more key columns in the index. If multiple columns are specified, separate them with comma. This property is available when Clustered Index is selected from the Table Option drop-down list. |
Partition |
Select this check box to specify the following partition options:
For more information about the table partition, see Partitioning tables in SQL Data Warehouse. This property is available when any option related to table creation is selected from the Action on table drop-down list. |
tStatCatcher Statistics |
Select this check box to gather the Job processing metadata at the Job level as well as at each component level. |
Global Variables
ERROR_MESSAGE |
The error message generated by the component when an error occurs. This is an After variable and it returns a string. |
NB_LINE_INSERTED |
The number of rows inserted. This is an After variable and it returns an integer. |
Usage
Usage rule |
This component can be used as a standalone component of a Job or subJob. |
Limitation |
Note that some features that are supported by other databases are not supported by Azure Synapse Analytics. For more information, see Unsupported table features. |