
What's new in R2020-06

Big Data: new features

Feature

Description

Available in

Support for Cloudera Data Platform (CDP)

When you configure a connection to a Hadoop cluster, you can select Cloudera CDP 7.1. You can also add and use the dynamic distributions of CDP Private Cloud Base 7.x.

The CDP integration in Talend Studio includes a new dependency management system that improves the performance of your Jobs at runtime.

CDP supports the following elements:
  • Data Integration components:
    • HBase
    • HDFS
    • Hive
  • Spark Batch components:
    • Azure Blob Storage
    • HBase
    • HDFS
    • Hive
    • Kudu
  • Spark Streaming components:
    • Azure Blob Storage
    • HBase
    • HDFS
    • Hive
    • Kafka

All Talend products with Big Data

Support of Microsoft HD Insight 4.0

You can now use the Microsoft HD Insight 4.0 distribution in Standard Jobs and in Spark Jobs that use Spark v2.3 and v2.4. This new support comes with several features:
  • Support of Azure Data Lake Storage (ADLS) Gen2: this storage option is available when you use Hive or HDFS, and when you configure a connection with tAzureFSConfiguration. You can also use ADLS Gen2 as primary storage when configuring a centralized connection to HD Insight in Metadata.
  • Support of TLS to secure connections to ADLS Gen2 and Azure Blob Storage.

All Talend products with Big Data

Check the status of Jobs that run on HD Insight

To check whether a Job is still running, configure polling that retrieves the status of the Job. In the Spark Configuration tab in the Run view of the Job, in the Job status polling configuration section, specify the time period between polls and the maximum number of retries.

All Talend products with Big Data
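
The two polling settings described for HD Insight Jobs, the time period between polls and the maximum number of retries, behave roughly like the loop below. This is only an illustrative sketch, not Talend's implementation: the class name, the `Status` values, and the `pollUntilDone` method are all hypothetical, and the status source is simulated rather than read from a cluster.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.function.Supplier;

public class JobStatusPolling {
    /** Hypothetical status values; not the actual states reported by HD Insight. */
    public enum Status { RUNNING, SUCCEEDED, FAILED }

    /**
     * Asks statusSource for the Job status at most maxRetries times,
     * waiting intervalMillis between polls, and returns the last status seen.
     */
    public static Status pollUntilDone(Supplier<Status> statusSource,
                                       long intervalMillis, int maxRetries) {
        Status last = Status.RUNNING;
        for (int attempt = 0; attempt < maxRetries; attempt++) {
            last = statusSource.get();
            if (last != Status.RUNNING) {
                return last; // the Job finished, stop polling
            }
            if (attempt < maxRetries - 1) {
                try {
                    Thread.sleep(intervalMillis); // wait before the next poll
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return last;
                }
            }
        }
        return last; // still RUNNING after the maximum number of polls
    }

    public static void main(String[] args) {
        // Simulated Job that reports RUNNING twice and then SUCCEEDED.
        Iterator<Status> fake =
            Arrays.asList(Status.RUNNING, Status.RUNNING, Status.SUCCEEDED).iterator();
        System.out.println(pollUntilDone(fake::next, 10L, 5)); // prints SUCCEEDED
    }
}
```

A short interval with a low retry limit gives up quickly on long Jobs, so the two values are best chosen together so that their product covers the expected run time.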

Use Databricks pools

You can reduce the start and auto-scaling times of your Databricks cluster by using a pool. In the Spark Configuration tab in the Run view of your Job, select the Use pool check box and indicate the ID of the pool that you want to use. You must also select the Use transient cluster check box. For more information about Databricks pools, see Pools in the Databricks documentation.

All Talend products with Big Data

Azure ADLS Gen2 components: Azure Active Directory authentication supported

The following Azure ADLS Gen2 components support Azure Active Directory authentication (AD authentication):

  • tAzureAdlsGen2Input
  • tAzureAdlsGen2Output

All Talend products with Big Data

Data Integration: new features

Feature

Description

Available in

Further enhancement of context propagation

Context propagation over the reference project has been further enhanced by improving conflict resolution for the Git/SVN technical files when merging branches.

All Talend products with Talend Studio

Microsoft SQL Server metadata wizard update

The default Db Version for Microsoft SQL Server in the Talend Studio metadata wizard has been changed to Microsoft.

All Talend products with Talend Studio

Stitch connectors integration

You can now search for Stitch connectors on the design workspace and in the Palette in Talend Studio. Selecting a connector in the search results takes you to the Stitch web page about that connector.

All Talend products with Talend Studio

tDataprepRun enhancement

The tDataprepRun component now supports the dynamic schema feature.

All Talend products with Talend Studio

New components available

This release provides the following two new components:

  • tCosmosDBSQLAPIInput, which retrieves data from a Cosmos database collection through the SQL API.
  • tCosmosDBSQLAPIOutput, which inserts, updates, upserts, or deletes documents in a Cosmos database collection through the SQL API, based on the incoming flow from the preceding component.

All Talend products with Talend Studio

Snowflake components: external OAuth support provided

The following Snowflake components support external OAuth for data access:

  • tSnowflakeBulkExec
  • tSnowflakeConnection
  • tSnowflakeInput
  • tSnowflakeOutput
  • tSnowflakeOutputBulk
  • tSnowflakeOutputBulkExec
  • tSnowflakeRow

All Talend products with Talend Studio
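
For readers who want to see what external OAuth looks like at the JDBC level, outside Talend Studio, the sketch below builds the connection properties that the Snowflake JDBC driver documents for OAuth (`authenticator` set to `oauth`, plus a `token`). It does not show how the Talend components are implemented. The class name, the account identifier, and the token placeholder are hypothetical, and the code only prepares the settings without opening a connection.

```java
import java.util.Properties;

public class SnowflakeOAuthProps {
    /** Builds JDBC properties for OAuth; the access token comes from your identity provider. */
    public static Properties build(String accessToken) {
        Properties props = new Properties();
        props.setProperty("authenticator", "oauth"); // select OAuth instead of user/password
        props.setProperty("token", accessToken);     // the OAuth access token itself
        return props;
    }

    /** Builds the JDBC URL for a given account identifier (hypothetical value in main). */
    public static String url(String accountIdentifier) {
        return "jdbc:snowflake://" + accountIdentifier + ".snowflakecomputing.com/";
    }

    public static void main(String[] args) {
        Properties props = build("<access-token>");
        System.out.println(url("myaccount"));
        System.out.println("authenticator=" + props.getProperty("authenticator"));
        // With the Snowflake JDBC driver on the classpath, these values would be
        // passed to java.sql.DriverManager.getConnection(url, props).
    }
}
```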

MS SQL Server connectors: the default JDBC provider changed to the official Microsoft driver

The default JDBC provider of the following components has changed to the official Microsoft driver:

  • tCreateTable
  • tELTMSSqlMap
  • tMSSqlBulkExec, tMSSqlConnection, tMSSqlInput, tMSSqlOutput, tMSSqlOutputBulkExec, tMSSqlRow, tMSSqlSCD, tMSSqlSP, tMSSqlCDC, tMSSqlInvalidRows, tMSSqlValidRows

All Talend products with Talend Studio
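
If you maintain custom code or connection strings alongside these components, it may help to recall the URL form and driver class used by the Microsoft JDBC Driver for SQL Server. The sketch below is an assumption-labeled illustration: the class name `MsSqlJdbcUrl` and the host, port, and database values are hypothetical, and the components themselves build the connection for you.

```java
public class MsSqlJdbcUrl {
    /** Driver class name of the Microsoft JDBC Driver for SQL Server. */
    public static final String DRIVER_CLASS = "com.microsoft.sqlserver.jdbc.SQLServerDriver";

    /** Builds a connection URL in the form expected by the Microsoft driver. */
    public static String build(String host, int port, String database) {
        return "jdbc:sqlserver://" + host + ":" + port + ";databaseName=" + database;
    }

    public static void main(String[] args) {
        // Hypothetical server and database names, for illustration only.
        System.out.println(build("localhost", 1433, "sales"));
        // prints jdbc:sqlserver://localhost:1433;databaseName=sales
    }
}
```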

tJDBCInput: new option provided to prevent unexpected character conversion in dynamic column

The tJDBCInput component provides the Allow special character in dynamic table name option, which keeps special characters in the column names of the input table as they are.

All Talend products with Talend Studio
