Setting general connection properties

This section describes how to configure general connection properties. For an explanation of how to configure advanced connection properties, see Setting advanced connection properties.

To add a Databricks Lakehouse (Delta) target endpoint to Qlik Replicate:

  1. In the Qlik Replicate console, click Manage Endpoint Connections to open the Manage Endpoint Connections dialog box.

    For more information on how to add an endpoint connection, see Defining and managing endpoints.

  2. In the Name field, specify a name for your endpoint.
  3. In the Description field, provide information that helps identify the endpoint. This field is optional.
  4. Select Databricks Lakehouse (Delta) as the endpoint Type.
  5. In the Databricks ODBC Access section, provide the following information:

    1. In the Host field, specify the host name of the Databricks workspace.
    2. In the Port field, specify the port via which to access the workspace.
    3. Authentication: Select one of the following:

      • Personal Access Token: In the Token field, enter your personal token for accessing the workspace.
      • OAuth: Provide the following information:
        • Client ID: The client ID of your application.

        • Client secret: The client secret of your application.

        Information note

        To use OAuth authentication, your Databricks database must be configured to use OAuth. For instructions, see the vendor's online help.
    4. In the HTTP Path field, specify the path to the cluster being used. (The connection details entered in this section can be verified outside of Replicate using the sketch that follows this procedure.)
    5. If you want the tables to be created in Unity Catalog, select Use Unity Catalog and then specify the Catalog name.

      Information note

      When the Use Unity Catalog option is selected, you need to allow Replicate to access external (unmanaged) tables by defining an external location in Databricks. For guidelines, see:

      https://docs.databricks.com/data-governance/unity-catalog/manage-external-locations-and-credentials.html#manage-permissions-for-an-external-location

      Note that even if you select the Managed tables option below, you still need to define an external location. This is because the Replicate Net Changes table is always created as unmanaged (external).

    6. In the Database field, specify the name of the Databricks target database.
    7. Choose whether to create tables as Managed tables or Unmanaged tables. If you selected Unmanaged tables, specify the location for your tables in the Location for tables field.

      Information note

      When creating unmanaged tables:

      • Tables are created using CREATE OR REPLACE
      • Tables are not dropped; they are truncated. Therefore, if a table is dropped in the source during CDC, the corresponding target table in Databricks will be truncated, not dropped.

      For more information on managed versus unmanaged tables, see https://docs.databricks.com/lakehouse/data-objects.html

  6. For the Staging section, see Storage types below.
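
The connection details entered in the Databricks ODBC Access section can be sanity-checked outside of Replicate before defining a task. The following is a minimal sketch that assumes the databricks-sql-connector Python package and personal access token authentication; the host, HTTP path, token, and query shown are placeholder examples only and are not part of Replicate.

    # Minimal sketch: open a short-lived connection using the values entered in
    # the Databricks ODBC Access section. All values below are placeholders.
    from databricks import sql

    with sql.connect(
        server_hostname="adb-1234567890123456.7.azuredatabricks.net",  # Host
        http_path="/sql/1.0/warehouses/abcdef1234567890",              # HTTP Path
        access_token="dapiXXXXXXXXXXXXXXXXXXXXXXXXXXXX",               # Personal Access Token
    ) as connection:
        with connection.cursor() as cursor:
            # Confirm that the target database named in the endpoint is visible.
            cursor.execute("SHOW DATABASES")
            print([row[0] for row in cursor.fetchall()])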

Storage types

Choose one of the available storage types and provide the required information, as described below.

Microsoft Azure Data Lake Storage (ADLS) Gen2

For Microsoft Azure Data Lake Storage (ADLS) Gen2, provide the following information:

  1. From the Storage type drop-down list, select Microsoft Azure Data Lake Storage (ADLS) Gen2.
  2. In the Storage account field, specify the name of your storage account.

    Information note

    To connect to an Azure resource on Government Cloud or China Cloud, you need to specify the full resource name of the storage account. For example, if the storage account is "myaccount", the resource name for China Cloud would be myaccount.dfs.core.chinacloudapi.cn

    In addition, you need to specify the login URL using the adlsLoginUrl internal parameter. For China Cloud, this would be https://login.chinacloudapi.cn

    For information on setting internal parameters, see Setting advanced connection properties.

  3. In the Azure Active Directory Tenant ID field, specify the Azure active directory tenant ID.
  4. In the Application Registration Client ID field, specify the application registration client ID.
  5. In the Application Registration Secret field, specify the application registration secret.
  6. In the Container field, specify the name of your container.
  7. In the Staging directory field, specify where to create the data files on ADLS.

    Information note

    The Staging directory name can only contain ASCII characters.
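
The tenant ID, application registration credentials, container, and staging directory entered above can be sanity-checked outside of Replicate. The following is a minimal sketch that assumes the azure-identity and azure-storage-file-datalake Python packages; the account, container, directory, and credential values are placeholders only.

    # Minimal sketch: confirm that the application registration can write to the
    # staging directory in the specified ADLS Gen2 container. All values are placeholders.
    from azure.identity import ClientSecretCredential
    from azure.storage.filedatalake import DataLakeServiceClient

    credential = ClientSecretCredential(
        tenant_id="00000000-0000-0000-0000-000000000000",    # Azure Active Directory Tenant ID
        client_id="11111111-1111-1111-1111-111111111111",    # Application Registration Client ID
        client_secret="<application-registration-secret>",   # Application Registration Secret
    )
    service = DataLakeServiceClient(
        account_url="https://mystorageaccount.dfs.core.windows.net",  # Storage account
        credential=credential,
    )
    file_system = service.get_file_system_client("mycontainer")         # Container
    probe = file_system.get_file_client("staging/replicate_probe.txt")  # Staging directory
    probe.upload_data(b"probe", overwrite=True)
    probe.delete_file()
    print("ADLS Gen2 staging location is writable")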

Amazon S3

For Amazon S3, provide the following information:

  1. From the Storage type drop-down list, select Amazon S3.
  2. In the Bucket name field, specify the name of your Amazon S3 bucket.

  3. In the Bucket region field, specify the region where your bucket is located. It is recommended to leave the default (Auto-Detect) as it usually eliminates the need to select a specific region. However, due to security considerations, for some regions (for example, AWS GovCloud) you might need to explicitly specify the region. To do this, select Other and specify the code in the Region code field.

    For a list of region codes, see the Region availability section in:

    https://docs.aws.amazon.com/general/latest/gr/s3.html

  4. In the Access options field, choose one of the following:
    • Key pair

      Choose this method to authenticate with your Access Key and Secret Key.

    • IAM Roles for EC2

      Choose this method if the machine on which Qlik Replicate is installed is configured to authenticate itself using an IAM role.

      For more information about this access option, see:

      http://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles.html

  5. In the Access key field, specify the access key information for Amazon S3.
    Information note: This option is only available when Key pair is selected as the access option.
  6. In the Secret key field, specify the secret key information for Amazon S3.
    Information note: This option is only available when Key pair is selected as the access option.
  7. In the Staging directory field, specify where to create the data files on Amazon S3.

    Information note

    The Staging directory name can only contain ASCII characters.
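
The bucket, region, and key-pair credentials entered above can be sanity-checked outside of Replicate. The following is a minimal sketch that assumes the boto3 Python package; the bucket name, region, keys, and staging directory are placeholders only. When the IAM Roles for EC2 access option is used, the same check can be run without passing keys, as boto3 then resolves the instance role automatically.

    # Minimal sketch: confirm that the key pair can write to the staging directory
    # in the specified bucket. All values below are placeholders.
    import boto3

    s3 = boto3.client(
        "s3",
        region_name="us-east-1",                   # Bucket region (or region code)
        aws_access_key_id="AKIAXXXXXXXXXXXXXXXX",  # Access key
        aws_secret_access_key="<secret-key>",      # Secret key
    )
    s3.put_object(
        Bucket="my-replicate-bucket",              # Bucket name
        Key="staging/replicate_probe.txt",         # Staging directory
        Body=b"probe",
    )
    s3.delete_object(Bucket="my-replicate-bucket", Key="staging/replicate_probe.txt")
    print("Amazon S3 staging location is writable")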

Google Cloud Storage

For Google Cloud Storage, provide the following information:

  1. From the Storage type drop-down list, select Google Cloud Storage.
  2. In the JSON credentials field, specify the JSON credentials for the service account key used to access the Google Cloud Storage bucket.
  3. In the Bucket name field, specify the name of the bucket in Google Cloud Storage where you want the data files to be written. This must be the same as the bucket you configured for your Databricks cluster.

  4. In the Staging directory field, specify where to create the data files in the specified bucket.

    Information note

    The Staging directory name can only contain ASCII characters.
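
The JSON credentials, bucket name, and staging directory entered above can be sanity-checked outside of Replicate. The following is a minimal sketch that assumes the google-cloud-storage Python package; the key file path, bucket name, and directory are placeholders only.

    # Minimal sketch: confirm that the service account key can write to the staging
    # directory in the specified bucket. All values below are placeholders.
    from google.cloud import storage

    client = storage.Client.from_service_account_json("service-account-key.json")  # JSON credentials
    bucket = client.bucket("my-replicate-bucket")                                   # Bucket name
    blob = bucket.blob("staging/replicate_probe.txt")                               # Staging directory
    blob.upload_from_string(b"probe")
    blob.delete()
    print("Google Cloud Storage staging location is writable")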

Databricks Volume

Prerequisites

Make sure the following prerequisites have been met:

  • At least one volume exists in Unity Catalog.

    For information on creating a volume, see CREATE VOLUME.

  • The following permissions are granted to Replicate:

    • READ VOLUME

    • WRITE VOLUME

  • Use Unity Catalog is selected in the endpoint settings, and a catalog name is specified.
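
The prerequisites above are configured on the Databricks side rather than in Replicate. The following is a minimal sketch of the kind of SQL involved, submitted here through the databricks-sql-connector Python package; the connection values, catalog, schema, volume, and principal names are placeholders only, and the exact statements may differ in your environment.

    # Minimal sketch: create a managed volume in Unity Catalog and grant Replicate's
    # principal READ VOLUME and WRITE VOLUME on it. All names are placeholders.
    from databricks import sql

    with sql.connect(
        server_hostname="adb-1234567890123456.7.azuredatabricks.net",
        http_path="/sql/1.0/warehouses/abcdef1234567890",
        access_token="dapiXXXXXXXXXXXXXXXXXXXXXXXXXXXX",
    ) as connection:
        with connection.cursor() as cursor:
            cursor.execute("CREATE VOLUME IF NOT EXISTS my_catalog.my_schema.replicate_staging")
            cursor.execute(
                "GRANT READ VOLUME, WRITE VOLUME "
                "ON VOLUME my_catalog.my_schema.replicate_staging "
                "TO `replicate-service-principal`"
            )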

Endpoint settings

Once you have fulfilled the prerequisites, provide the following information in the endpoint settings:

  1. From the Storage type drop-down list, select Databricks Volume.
  2. In the Volume name field, specify the volume name. You can either type the volume name or click Browse to select one.
  3. In the Staging directory field, specify where to create the data files in the specified volume. You can either type the directory name or click Browse to select one.
