
Setting general connection properties

This section describes how to configure general connection properties. For an explanation of how to configure advanced connection properties, see Setting advanced connection properties.

To configure the Cloudera Iceberg target endpoint in Qlik Replicate:

  1. In the Qlik Replicate console, click Manage Endpoint Connections to open the Manage Endpoint Connections dialog box.

    For more information on adding an endpoint to Qlik Replicate, see Defining and managing endpoints.

  2. In the Name field, type a name for your endpoint. This can be any name that will help to identify the endpoint being used.
  3. In the Description field, type a description that helps to identify the Cloudera Iceberg endpoint. This is optional.
  4. Select Cloudera Iceberg as the endpoint Type.
  5. In the Security section, enter the following settings:

    1. To encrypt the data between the Replicate machine and HDFS, select Use SSL. In order to use SSL, first make sure that the SSL prerequisites described in Prerequisites have been met.

      In the CA path field, either enter the path to a specific CA certificate on the Replicate machine, or use the Browse button to upload the file contents into the endpoint settings. Using the Browse button eliminates the need to access files on the Replicate machine, which is recommended for better security. If you use the Browse button, the file content will be stored as base64-encoded data in this field.

    2. Select one of the following authentication types:

      • Username and password - Select to connect to the Cloudera Iceberg NameNode or to the Knox Gateway (when enabled - see below) with a user name and password. Then, in the User name and Password fields, specify the required user name and password.

      • Kerberos - Select to authenticate against the Cloudera Iceberg cluster using Kerberos. Replicate automatically detects whether Qlik Replicate Server is running on Linux or on Windows and displays the appropriate settings.

        Qlik Replicate Server on Linux:

        Information note

        Replicate uses MIT KDC for Kerberos authentication on Linux.

        When Qlik Replicate Server is running on Linux, provide the following information:

        • Realm: The name of the realm in which your Cloudera Iceberg cluster resides.

          For example, if the full principal name is john.doe@EXAMPLE.COM, then EXAMPLE.COM is the realm.

        • Principal: The user name to use for authentication. The principal must be a member of the realm entered above.

          For example, if the full principal name is john.doe@EXAMPLE.COM, then john.doe is the principal.

        • Keytab file: Either enter the full path of the Keytab file on the Replicate machine, or use the Browse button to upload the file contents into the endpoint settings. Using the Browse button eliminates the need to access files on the Replicate machine, which is recommended for better security. If you use the Browse button, the file content will be stored as base64-encoded data in this field. The Keytab file should contain the key of the Principal specified above.

        Qlik Replicate Server on Windows:

        When Qlik Replicate Server is running on Windows, provide the following information for accessing your Active Directory KDC:

        Information note

        When the Replicate KDC and the Cloudera Iceberg KDC are in different domains, a relationship of trust must exist between the two domains.

        • Realm: The name of the realm/domain in which your Cloudera Iceberg cluster resides (where realm is the MIT term while domain is the Active Directory term).
        • Principal: The user name to use for authentication. The principal must be a member of the realm/domain entered above.
        • Password: The password for the principal entered above.

        If you are unsure about any of the above, consult your IT/security administrator.

        For additional steps required to complete setup for Kerberos authentication, see Using Kerberos authentication on Windows.

  6. Replicate connects to Apache Impala for metadata operations. In the ODBC Access section, enter the following settings:

    • Impala host: The host name or IP address of your Apache Impala host.

    • Port: The port number of your Apache Impala host.

    • Database: The name of the database on your Apache Impala host.

  7. In the Data Loading section, enter the following settings:

    • In the WebHDFS NameNode field, specify the IP address or hostname of the NameNode.

    • Replicate supports replication to an HDFS High Availability cluster. In such a configuration, Replicate communicates with the Active node, but switches to the Standby node in the event of failover. To enable this feature, select the High Availability check box. Then, specify the FQDN (Fully Qualified Domain Name) of the standby NameNode in the Standby NameNode field.

    • In the Port field, optionally change the default port (9870).
    • In the Staging directory field, specify where to create the data files on HDFS.

      Information note

      The Staging directory name can only contain ASCII characters.

    • Maximum file size (MB): The maximum size a file can reach before it is loaded to the target. If you encounter performance issues, try adjusting this parameter.

    • Number of files to load per batch: Relevant for Full Load only. If you encounter performance issues, try adjusting this number. The default value is 30. Note that increasing the number will not necessarily improve performance and might even degrade it, because the more files loaded per batch, the heavier the load on the server machine.

    • Batch load timeout (seconds): If you encounter frequent timeouts when loading the files, try increasing this value.

    Information note

    To verify that the connection information you entered is correct, click Test Connection.

    If the connection is successful, a success message is displayed. If the connection fails, an error message is shown at the bottom of the dialog.

    To view the log entry if the connection fails, click View Log. The server log is displayed with the information about the connection failure. Note that this button is not available unless the test connection fails.
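
Before entering the values described in the steps above, it can help to sanity-check them outside Replicate. The following Python sketch is only an illustration and is not part of Qlik Replicate: the function names are hypothetical, and it assumes that the NameNode exposes the standard WebHDFS REST path (/webhdfs/v1) on the port entered in the Port field. It splits a full Kerberos principal name into the Principal and Realm values, checks that a Staging directory name contains only ASCII characters, and builds a WebHDFS status URL that you could open manually (with suitable credentials) to confirm that the NameNode is reachable.

```python
from typing import Tuple
from urllib.parse import quote


def split_principal(full_principal: str) -> Tuple[str, str]:
    """Split a full principal name into (principal, realm).

    Mirrors the example in the Kerberos settings:
    john.doe@EXAMPLE.COM -> principal john.doe, realm EXAMPLE.COM.
    """
    principal, sep, realm = full_principal.rpartition("@")
    if not sep or not principal or not realm:
        raise ValueError(
            f"expected <principal>@<REALM>, got {full_principal!r}"
        )
    return principal, realm


def is_valid_staging_dir(path: str) -> bool:
    """Return True if the path is non-empty and contains only ASCII characters."""
    return bool(path) and path.isascii()


def webhdfs_status_url(
    namenode: str,
    staging_dir: str,
    port: int = 9870,       # default port shown in the Port field
    use_ssl: bool = False,  # corresponds to the Use SSL option
) -> str:
    """Build a WebHDFS GETFILESTATUS URL for the staging directory.

    Assumption: staging_dir is an absolute HDFS path such as /tmp/replicate.
    This only builds the URL; it does not send a request or authenticate.
    """
    scheme = "https" if use_ssl else "http"
    return (
        f"{scheme}://{namenode}:{port}/webhdfs/v1"
        f"{quote(staging_dir)}?op=GETFILESTATUS"
    )


if __name__ == "__main__":
    # Hypothetical values used for illustration only.
    print(split_principal("john.doe@EXAMPLE.COM"))
    print(is_valid_staging_dir("/tmp/replicate"))
    print(webhdfs_status_url("nn1.example.com", "/tmp/replicate"))
```

The host name nn1.example.com and the path /tmp/replicate are placeholders. A check like this does not replace Test Connection, which remains the way to confirm that Replicate itself can connect with the settings you entered.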
