Creating a connection to an ADLS Databricks cluster

Before you begin

About this task

To connect to a Databricks cluster on Amazon S3 instead, follow the procedure in Adding S3 specific properties to access the S3 system from Databricks.

Procedure

  1. In the DQ Repository tree view, expand Metadata and right-click DB Connections.
  2. Click Create DB connection.
    The Database Connection wizard is displayed.
  3. Enter a name and click Next. The other fields are optional.
  4. Select JDBC as the DB Type.
  5. In the JDBC URL field, enter the URL of your ADLS Databricks cluster. To get the URL:
    1. Go to Azure Databricks.
    2. In the clusters list, click the cluster to which you want to connect.
    3. Expand the Advanced Options section and select the JDBC/ODBC tab.
    4. Copy the content of the JDBC URL field. The URL format is jdbc:spark://<server-hostname>:<port>/default;transportMode=http;ssl=1;httpPath=<http-path>;AuthMech=3.
      Note: To store the token more securely, enter the UID and PWD parameters in the Database Connection wizard of Talend Studio instead of in the JDBC URL.
  6. Go back to the Database Connection wizard.
  7. Paste the JDBC URL.
  8. Add the JDBC driver to the Drivers list:
    1. Click the [+] button. A new line is added to the list.
    2. Click the […] button next to the new line. The Module dialog box is displayed.
    3. In the Platform list, select the JDBC driver and click OK. You return to the Database Connection wizard.
  9. Click Select class name next to the Driver Class field and select com.simba.spark.jdbc4.Driver.
  10. Enter the User Id and Password.
  11. In Mapping file, select Mapping Hive.
  12. Click Test Connection.
    • If the test is successful, click Finish to close the wizard.
    • If the test fails, verify the configuration.
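
If the test fails, a common first check is the JDBC URL itself. The following is a minimal Python sketch, not part of Talend Studio, that assembles a URL in the format shown in step 5 and flags properties that are missing or that should be entered in the wizard instead. The function names and the hostname, port, and HTTP path values are hypothetical placeholders; replace them with the values from the JDBC/ODBC tab of your cluster.

```python
REQUIRED_PROPERTIES = ("transportMode", "ssl", "httpPath", "AuthMech")
CREDENTIAL_PROPERTIES = ("UID", "PWD")


def build_jdbc_url(server_hostname: str, port: int, http_path: str) -> str:
    """Assemble a JDBC URL in the format shown in step 5."""
    if not server_hostname or not http_path:
        raise ValueError("server_hostname and http_path are required")
    return (
        f"jdbc:spark://{server_hostname}:{port}/default;"
        f"transportMode=http;ssl=1;httpPath={http_path};AuthMech=3"
    )


def parse_jdbc_properties(url: str) -> dict:
    """Return the ';'-separated key=value properties that follow the address."""
    _, _, tail = url.partition(";")
    properties = {}
    for item in tail.split(";"):
        if item:
            key, _, value = item.partition("=")
            properties[key] = value
    return properties


def check_url(url: str) -> list:
    """Return a list of problems found in a pasted URL (empty if none)."""
    problems = []
    if not url.startswith("jdbc:spark://"):
        problems.append("URL does not start with jdbc:spark://")
    properties = parse_jdbc_properties(url)
    for key in REQUIRED_PROPERTIES:
        if key not in properties:
            problems.append(f"missing property: {key}")
    for key in CREDENTIAL_PROPERTIES:
        if key in properties:
            problems.append(f"{key} is in the URL; enter it in the wizard instead")
    return problems


if __name__ == "__main__":
    # Placeholder values only; copy the real ones from your cluster.
    url = build_jdbc_url("example.azuredatabricks.net", 443, "sql/protocolv1/o/0/example")
    print(url)
    print(check_url(url) or "No problems found")
```

The check only inspects the text of the URL. It does not contact the cluster, so a URL that passes can still fail Test Connection because of an invalid token, a stopped cluster, or a missing driver.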
