Setting general connection properties
This section describes how to configure general connection properties. For an explanation of how to configure advanced connection properties, see Setting advanced connection properties.
To define the general connection properties:
- Click the Manage Endpoint Connections toolbar button.
The Manage Endpoint Connections dialog box opens.
- Click the New Endpoint Connection toolbar button.
The Name, Description, Type and Role fields are displayed on the right.
- In the Name field, specify a display name for the endpoint.
- In the Description field, optionally type a description for the Kafka endpoint.
- Select Target as the endpoint Role.
- Select Kafka as the endpoint Type.
The dialog box is divided into General and Advanced tabs.
- In the Broker servers field, specify one or more broker servers using the following format (for high availability):
server1[:port1][,server2[:port2]]
Example:
192.168.1.100:9092,192.168.1.101:9093
Replicate will connect to the first available host. If a host is specified without a port, port 9092 will be used as the default. (An illustrative client-side example of the broker list follows this procedure.)
Information note: When using SSL or Kerberos authentication, you must specify the broker FQDN (i.e. not the IP address).
Information note: All of the broker servers in your cluster need to be accessible to Replicate. However, you do not need to specify all of the servers in the Broker servers field, because Replicate only needs to connect to one of the servers in order to retrieve the connection details for the other servers in the cluster. It is therefore best practice to specify the servers that are most likely to be available when the task is run. The servers to which Replicate produces messages are determined by the topic and partitioning settings described below.
- Continue as described below.
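The broker list above uses the same host[:port] convention as standard Kafka clients, and a single reachable broker is enough to discover the rest of the cluster. As an illustration only (using the confluent-kafka Python client, which is not part of Replicate; the addresses are the example values above), the following sketch connects to the first available broker and lists the cluster metadata:

from confluent_kafka.admin import AdminClient

# Same format as the Broker servers field; port 9092 is the Kafka default.
admin = AdminClient({"bootstrap.servers": "192.168.1.100:9092,192.168.1.101:9093"})

# One reachable broker is enough to discover every broker in the cluster.
metadata = admin.list_topics(timeout=10)
for broker in metadata.brokers.values():
    print(broker.id, broker.host, broker.port)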
Security
- The Use SSL and Certificate authentication options are only supported from Kafka 0.9 or later.
- The CA file, public key file and private key file must all be in PEM format.
- The Kerberos and User name and password authentication methods are only supported from Kafka 0.10 or later.
- All of the broker servers in the cluster must be configured to accept connection requests using the selected Authentication method.
In the Security section, the following options can be set:
- Use SSL (supports TLS 1.0, 1.1 and 1.2): Select this option to encrypt the communication between the Replicate machine and the broker server(s). If the brokers are configured to require SSL, then you must select this option.
- CA path: Specify either the full path (i.e. including the file name) to a specific CA certificate in PEM format or the directory containing certificate files with hash names.
- Authentication: Select one of the following:
- None - No authentication.
- Certificate - If you select this option, you also need to provide the following information:
Note: The public and private key files must be in PEM format.
- Public key file - The full path to the public key file on the Replicate Server machine.
- Private key file - The full path to the private key file on the Replicate Server machine.
- Private key password - The password for the private key file.
- Kerberos (SASL/GSSAPI) - Select to authenticate against the Kafka cluster using Kerberos. Replicate automatically detects whether Qlik Replicate Server is running on Linux or on Windows and displays the appropriate settings.
Qlik Replicate Server on Linux:
- Principal - The Kerberos principal used to authenticate against the broker server(s).
- Keytab file - The full path to the keytab file (that contains the specified principal) on the Replicate Server machine.
Information note: In order to use Kerberos authentication on Linux, the Kerberos client (workstation) package should be installed.
Qlik Replicate Server on Windows:
Information note: Both Replicate Server and the Kafka brokers must be connected to Active Directory KDC.
- Realm - The name of the domain in which the broker servers reside.
- Principal - The user name to use for authentication. The principal must be a member of the domain entered above.
- Password - The password for the principal entered above.
For additional steps required to complete setup for Kerberos authentication, see Using Kerberos Authentication on Windows.
- Username and password (SASL/PLAIN) - Select this option to authenticate yourself using a user name and password (SASL/PLAIN). To prevent the password from being sent in clear text, it is strongly recommended to enable the Use SSL option as well.
- Username and Password (SASL/SCRAM-SHA-256) - Select this option to authenticate yourself using a user name and password (SASL/SCRAM-SHA-256).
Note that selecting this option also requires each broker's server.properties file to be configured with the corresponding SASL/SCRAM mechanism.
- Username and Password (SASL/SCRAM-SHA-512) - Select this option to authenticate yourself using a user name and password (SASL/SCRAM-SHA-512).
Note that selecting this option also requires each broker's server.properties file to be configured with the corresponding SASL/SCRAM mechanism.
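The authentication options above correspond to standard Kafka client security settings, which can be useful when testing connectivity to the brokers outside of Replicate. As a point of reference only, a Kafka client (here the confluent-kafka Python client) configured for SSL plus SASL/SCRAM-SHA-256 might look like the following sketch; the broker address, file path, and credentials are hypothetical, and Replicate itself is configured solely through the dialog fields described above:

from confluent_kafka import Consumer

# Hypothetical client-side equivalents of the dialog settings:
#   Use SSL                                     -> security.protocol, ssl.ca.location
#   Username and Password (SASL/SCRAM-SHA-256)  -> sasl.mechanism, sasl.username, sasl.password
conf = {
    "bootstrap.servers": "broker1.example.com:9093",
    "security.protocol": "SASL_SSL",
    "ssl.ca.location": "/etc/pki/kafka/ca.pem",   # CA certificate in PEM format
    "sasl.mechanism": "SCRAM-SHA-256",
    "sasl.username": "replicate_user",
    "sasl.password": "replicate_password",
    "group.id": "security-config-example",
}
consumer = Consumer(conf)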
Message properties
In the Message Properties section, set the following properties:
- Choose JSON or Avro as the message format.
Information note: Qlik provides an Avro Message Decoder SDK for consuming Avro messages produced by Qlik Replicate. You can download the SDK as follows:
- Go to Product Downloads.
- Select Qlik Data Integration.
- Scroll down the Product list and select Replicate.
- In the Download Link column, locate the QlikReplicate_<version>_Avro_Decoder_SDK.zip file. Before starting the download, check the Version column to make sure that the version correlates with the Replicate version you have installed.
- Proceed to download the QlikReplicate_<version>_Avro_Decoder_SDK.zip file.
For usage instructions, see Kafka Avro consumers API.
An understanding of the Replicate envelope schema is a prerequisite for consuming Avro messages produced by Qlik Replicate. If you do not wish to use the SDK, see The Qlik Envelope for a description of the Replicate envelope schema.
- From the Compression drop-down list, optionally select one of the available compression methods (Snappy or gzip). The default is None.
- If you selected Avro, optionally select the Use logical data types for specific data types check box to map some of the number-based Qlik Replicate data types to Avro logical data types. When this option is not selected (the default), all Qlik Replicate data types will be mapped to Avro primitive data types.
For more information on Qlik Replicate to Avro data type mapping, see Mapping from Qlik Replicate Data Types to Avro.
- If the message Format is set to Avro, Publish is set to Publish data schemas to Confluent Schema Registry or Publish data schemas to Hortonworks Schema Registry (see below), and the Message Key is not set to None, you can select the Encode message key in Avro format check box. When this option is not selected (the default), the message key will be in text format.
Information note: If you are using the Confluent JDBC Sink Connector to consume messages, this option must be enabled.
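If you choose the JSON message format, consumers can parse the payload with any JSON library. The sketch below (confluent-kafka Python client; the broker address and topic name are hypothetical) simply reads one message and parses it; the exact payload layout depends on the envelope and schema-publishing settings described in this section:

import json
from confluent_kafka import Consumer

consumer = Consumer({
    "bootstrap.servers": "192.168.1.100:9092",
    "group.id": "json-example",
    "auto.offset.reset": "earliest",
})
consumer.subscribe(["dbo.Employees"])

msg = consumer.poll(10.0)   # wait up to 10 seconds for a message
if msg is not None and msg.error() is None:
    record = json.loads(msg.value())   # payload layout depends on the settings above
    print(record)
consumer.close()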
Data message publishing
In the Data Message Publishing section, set the following properties:
- In the Publish the data to field, choose one of the following:
- Specific topic - to publish the data to a single topic. Either type a topic name or use the browse button to select the desired topic.
- Specific topic for each table - to publish the data to multiple topics corresponding to the source table names.
The target topic name consists of the source schema name and the source table name, separated by a period (e.g. "dbo.Employees"). The format of the target topic name is important as you will need to prepare these topics in advance.
Information note: If the topics do not exist, configure the brokers with auto.create.topics.enable=true to enable Replicate to create the topics during runtime. Otherwise, the task will fail.
- From the Partition strategy drop-down list, select either Random or By message key. If you select Random, each message will be written to a randomly selected partition. If you select By message key, messages will be written to partitions based on the selected Message key (described below).
- From the Message key drop-down list, select one of the following:
Information note: If the message Format is set to Avro and the Encode message key in Avro format option is enabled, the message key will be an Avro record with an Avro schema.
- None - To create messages without a message key.
- Schema and table name - For each message, the message key will contain a combination of schema and table name (e.g. "dbo+Employees").
When By message key is selected as the Partition strategy, messages consisting of the same schema and table name will be written to the same partition.
- Primary key columns - For each message, the message key will contain the value of the primary key column.
When By message key is selected as the Partition strategy, messages consisting of the same primary key value will be written to the same partition.
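Because the per-table topic names follow the <schema>.<table> pattern shown above, they can be prepared in advance with any Kafka administration tool if the brokers are not configured with auto.create.topics.enable=true. A minimal sketch using the confluent-kafka Python AdminClient (topic names, partition count, and replication factor are hypothetical):

from confluent_kafka.admin import AdminClient, NewTopic

admin = AdminClient({"bootstrap.servers": "192.168.1.100:9092"})

# One topic per source table, named <schema>.<table> as described above.
topics = [
    NewTopic("dbo.Employees", num_partitions=6, replication_factor=3),
    NewTopic("dbo.Departments", num_partitions=6, replication_factor=3),
]

# create_topics() returns a dict of topic name -> future; wait for each to complete.
for topic, future in admin.create_topics(topics).items():
    try:
        future.result()
        print(f"Created {topic}")
    except Exception as err:
        print(f"Failed to create {topic}: {err}")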
Metadata message publishing
In the Metadata Message Publishing section, specify whether or where to publish the message metadata.
From the Publish drop-down list, select one of the following options:
- Do not publish metadata messages
When this option is selected, only the data messages will be published. Additionally, the Wrap data messages with the Replicate Envelope option (enabled by default) will be displayed. This option is useful for organizations that wish to leverage the Qlik Envelope structure to process the data messages. If you do not require the additional information provided by the Qlik Envelope (e.g. due to existing message consumption processes), then disable this option.
- Publish metadata messages to a dedicated metadata topic
If you select this option, either type the Topic name or use the Browse button to select the desired topic. This option is required if the message format is set to Avro since Avro-formatted messages can only be opened using the Avro schema.
- Publish data schemas to the Confluent Schema Registry
If you select this option, you must also configure the Schema Registry Connection Properties described below.
- Publish data schemas to the Hortonworks Schema Registry
If you select this option, you must also configure the Schema Registry Connection Properties described below.
- It is strongly recommended not to publish schema messages to the same topic as data messages.
- If the topics do not exist, configure the brokers with auto.create.topics.enable=true to enable Replicate to create the topics during runtime. Otherwise, the task will fail.
- The Confluent and Hortonworks Schema Registry options support Avro message format only.
Schema Registry connection properties
- Schema Registry servers: Specify one or more Schema Registry servers using the following format (for high availability):
When publishing data schemas to the Confluent Schema Registry:
server1:port1[,server2[:port2]]
Example:
192.168.1.100:8081,192.168.1.101:8081
Replicate will connect to the first available host.
When publishing data schemas to the Hortonworks Schema Registry:
server1:port1[,server2[:port2]]
Example:
192.168.1.100:7788,192.168.1.101:7788
Replicate will connect to the first available host.
- Use SSL (supports TLS 1.0, 1.1 and 1.2): Select this option to encrypt the data between the Replicate machine and the Schema Registry server(s). If the servers are configured to require SSL, then you must select this option.
- CA path: Specify one of the following:
- The full path (i.e. including the file name) to a specific CA certificate in PEM format
- The directory containing certificate files with hash names
- Authentication - Select one of the following Schema Registry authentication options:
- None - No authentication.
- Kerberos - Select to authenticate using Kerberos.
Information note:
- This option is only supported when publishing data schemas to the Hortonworks Schema Registry and when Qlik Replicate Server is running on Linux.
- In order to use Kerberos authentication on Linux, the Kerberos client (workstation) package should be installed.
- Principal - The Kerberos principal used to authenticate against the Schema Registry.
- Keytab file - The full path to the keytab file (that contains the specified principal) on the Replicate Server machine.
- Certificate - Select to authenticate using a certificate.
Information note: This option is only supported when publishing to the Confluent Schema Registry.
If you select this option, you also need to provide the following information:
- Public key file - The full path to the public key file on the Replicate Server machine.
- Private key file - The full path to the private key file on the Replicate Server machine.
- Private key password - The password for the private key file.
- User name and password - Select to authenticate with a user name and password. Then enter your login credentials in the Username and password fields.
Information note: This option is only supported when publishing to the Confluent Schema Registry.
- Certificate + User name and password - Select to authenticate using both a certificate and a user name and password.
When this option is selected, enter the required information in the Public key file, Private key file, Private key password, Username, and Password fields described above.
Information note: This option is only supported when publishing to the Confluent Schema Registry.
- Use proxy server - Select to publish to the Schema Registry via a proxy server.
Information note: This option is only supported when publishing to the Confluent Schema Registry.
- Host name - The host name of the proxy server.
- Port - The port via which to access the proxy server.
- Scheme - Select which protocol to use to access the server (HTTP or HTTPS).
- SSL CA Path - The location of the CA file on the Replicate Server machine when HTTPS is the selected Scheme.
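When data schemas are published to a Schema Registry, downstream consumers typically connect to the same registry servers in order to resolve the Avro schemas. As an illustration only, the sketch below uses the confluent-kafka Python Schema Registry client to connect and list the registered subjects; the URL, CA path, and credentials are hypothetical and should mirror whichever SSL and Authentication options you selected above:

from confluent_kafka.schema_registry import SchemaRegistryClient

# Mirrors the Schema Registry servers and optional SSL / user name and password settings above.
registry = SchemaRegistryClient({
    "url": "https://192.168.1.100:8081",
    "ssl.ca.location": "/etc/pki/registry/ca.pem",              # CA path (when Use SSL is selected)
    "basic.auth.user.info": "registry_user:registry_password",  # user name and password authentication
})

# List the subjects under which schemas have been registered.
for subject in registry.get_subjects():
    print(subject)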
Schema Registry subject properties
Subject Name Strategy
- The first strategy (Schema and Table Name Strategy) is a proprietary Qlik strategy while the other three are standard Confluent subject name strategies.
- For strategies with "Topic" in the subject name, the following should be considered:
- When the "add $topic column" method is used, the subject will be created only once (as the $topic expression might create multiple subjects).
See also steps 3 and 4 in Overriding the default settings.
- The "Metadata only" Advanced run option is not supported. This is because Replicate depends on the arrival of the first record per table in order to create the subject.
- Select one of the available subject name strategies.
- Schema and Table Name - The default
- Topic Name
- Record Name
- Topic and Record Name - See also: Message Format.
For more information on Confluent's subject name strategies, see https://docs.confluent.io/platform/current/schema-registry/serdes-develop/index.html#subject-name-strategy
Subject Compatibility Mode
Select a compatibility mode from the Subject compatibility mode drop-down list. A description of the selected mode will appear below the drop-down list.
- Depending on the selected Subject Name Strategy, some of the compatibility modes may not be available.
- When publishing messages to a Schema Registry, the default subject compatibility mode for all newly created Control Table subjects will be None, regardless of the selected Subject compatibility mode.
Should you wish the selected Subject compatibility mode to apply to Control Tables as well, set the setNonCompatibilityForControlTables internal parameter to false.
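If you need to verify or override the compatibility mode of an individual subject after it has been created (for example, a Control Table subject that defaulted to None as described above), the Confluent Schema Registry REST API exposes this per subject. A minimal sketch using Python's requests library; the registry URL and subject name are hypothetical:

import requests

REGISTRY = "http://192.168.1.100:8081"
SUBJECT = "attrep_apply_exceptions-value"   # hypothetical Control Table subject name

# Read the subject's current compatibility level (404 means it falls back to the global default).
resp = requests.get(f"{REGISTRY}/config/{SUBJECT}")
print(resp.status_code, resp.json())

# Override the compatibility level for this subject only.
resp = requests.put(
    f"{REGISTRY}/config/{SUBJECT}",
    json={"compatibility": "BACKWARD"},
    headers={"Content-Type": "application/vnd.schemaregistry.v1+json"},
)
print(resp.json())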