Use an Azure Data Lake Storage Gen2 connection to create a dataset from a Databricks Delta
table, and use it in Talend Cloud Data Preparation.
Procedure
-
Click .
-
In the panel that opens, select the type of connection you
want to create.
Example
Azure Data Lake Storage Gen2
-
Select your engine
in the Engine list.
Note:
- It is recommended to use the Remote Engine Gen2 rather than
the Cloud Engine for Design for advanced
processing of data.
- If no Remote Engine Gen2 has been created from Talend Management Console, or if one
exists but appears as unavailable (meaning it is not up and running), you cannot
select a Connection type in the list or save the new connection.
- The list of available connection types depends on the engine you
have selected.
-
Select the type of connection you want to create.
Here, select Azure Data Lake Storage Gen2.
-
Fill in the connection properties to access your Azure Data Lake Storage Gen2
file system as described in Azure Data Lake Storage Gen2 properties. Check the
connection, then click Add dataset.
-
In the Add a new dataset panel, name your dataset.
Example
Databricks Delta table
-
Fill in the required properties to access the Delta table in your storage
account.
-
In the Format field, select
Delta.
-
Click View sample to see a preview of your dataset, and
click Validate to finalize the dataset creation.
-
To create a new preparation on the Databricks Delta
table, you can:
- From the Dataset list, hover over the dataset you want to use as the source
for a preparation, click the Talend Cloud Data Preparation icon, and select
Add to start working on this data directly.
- From the preparations list, click the Add
preparation button. In the form that opens, give a name to
your preparation, select the source dataset that has been created beforehand
and click Submit.
Results
The preparation opens directly with an empty recipe, and you can start performing
preparation operations on your Databricks Delta dataset. The preparation is created
in the folder in which you are currently working. It is automatically saved in the
preparations list, and all the changes you make when preparing data are also saved
automatically.
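
As a rough illustration of what the storage properties in the procedure point to, the Python sketch below is not part of Talend Cloud and does not use any Talend API. It assumes two general conventions: outside Talend, a location in an Azure Data Lake Storage Gen2 file system is commonly written as an abfss:// URI, and a Delta table is a directory that contains a _delta_log folder holding JSON commit files. The function names and the sample account, file system, and path values are hypothetical.

```python
import tempfile
from pathlib import Path


def abfss_uri(account: str, filesystem: str, path: str) -> str:
    """Build an abfss:// URI from a storage account, a file system and a path.

    Hypothetical helper: it only formats a string and does not contact Azure.
    """
    return f"abfss://{filesystem}@{account}.dfs.core.windows.net/{path.strip('/')}"


def looks_like_delta_table(table_dir: Path) -> bool:
    """Return True if table_dir has a _delta_log folder with at least one JSON commit.

    Hypothetical helper: this is a quick local sanity check, not a full
    validation of the Delta transaction log.
    """
    log_dir = table_dir / "_delta_log"
    return log_dir.is_dir() and any(log_dir.glob("*.json"))


if __name__ == "__main__":
    # Sample values are placeholders, not real resources.
    print(abfss_uri("mystorageaccount", "myfilesystem", "/tables/customers/"))

    # Simulate a Delta table layout in a temporary local directory.
    with tempfile.TemporaryDirectory() as tmp:
        table = Path(tmp) / "customers"
        (table / "_delta_log").mkdir(parents=True)
        (table / "_delta_log" / "00000000000000000000.json").write_text("{}")
        print(looks_like_delta_table(table))
```

A check of this kind can help confirm that the path entered for the dataset points at the root folder of the Delta table rather than at one of its data files, which is one possible reason for a sample preview to fail.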