Component-specific settings for tKafkaInput
The following table describes the Job script functions and parameters that you can define in the setSettings {} function of the component.
Function/parameter | Description | Mandatory? |
---|---|---|
OUTPUT_TYPE |
Specify the type of data to be sent to the next component.
Typically, setting this parameter to STRING is recommended, because tKafkaInput can automatically translate the Kafka byte[] messages into strings to be processed by the Job. However, if the format of the Kafka messages, such as Protobuf, is not known to tKafkaInput, use BYTES and then use a custom code component such as tJavaRow to deserialize the messages into strings so that the other components of the same Job can process them. |
No |
USE_EXISTING_CONNECTION |
Set this parameter to true and specify the name of the relevant connection component using the CONNECTION parameter to reuse the connection details you already defined. |
No |
KAFKA_VERSION |
Specify the version of the Kafka cluster to be used. Acceptable values:
|
Yes |
ZOOKEEPER_CONNECT |
Specify the address of the ZooKeeper service of the Kafka cluster to be used, in the form of "\"zk1:port1,zk2:port2,...\"". This parameter works only when the Kafka cluster version is Kafka 0.8.2.0. |
Yes |
BROKER_LIST |
Specify the addresses of the broker nodes of the Kafka cluster to be used, in the form of "\"host1:port1,host2:port2,...\"". This parameter works when the Kafka cluster version is Kafka 0.9.2.1 or higher. |
Yes |
KAFKA_TOPIC |
Specify the name of the topic from which this component receives the feed of messages. |
Yes |
GROUP_ID |
Specify the name of the consumer group to which you want the current consumer to belong. This consumer group will be created at runtime if it does not exist at that moment. |
Yes |
RESET_OFFSET |
Set this parameter to true to clear the offsets saved for the consumer group to be used so that this consumer group is handled as a new group that has not consumed any messages. By default, this parameter is set to false. |
No |
AUTO_OFFSET_RESET |
Specify the starting point from which the messages of a topic are consumed. Acceptable values:
This parameter works only when the Kafka cluster version is Kafka 0.8.2.0. |
No |
AUTO_OFFSET_RESET_NEW |
Specify the starting point from which the messages of a topic are consumed. Acceptable values:
This parameter works when the Kafka cluster version is Kafka 0.9.2.1 or higher. |
No |
KAFKA_OFFSET_STORAGE |
Specify the system to which you want to commit the offsets of the consumed messages. Acceptable values:
This parameter works only when the Kafka cluster version is Kafka 0.8.2.0. |
No |
KAFKA_DUAL_COMMIT_CHECK |
Set this parameter to true to commit the offsets to both ZooKeeper and Kafka; set it to false to commit the offsets only to Kafka. By default, this parameter is set to true. This parameter works only when the offset storage system is Kafka. |
No |
AUTO_COMMIT_OFFSET |
Set this parameter to true and use the KAFKA_COMMIT_INTERVAL parameter to specify the time interval at the end of which tKafkaInput automatically saves its consumption state. By default, this parameter is set to true and the interval is 5000 milliseconds. Note that the offsets are committed only at the end of each interval: if your Job stops in the middle of an interval, the consumption state within that interval is not committed. |
No |
USE_BATCH_MAX_DURATION |
Set this parameter to true and use the BATCH_MAX_DURATION parameter to specify the duration (in milliseconds) at the end of which tKafkaInput stops running. By default, this parameter is set to false and the default duration is 600000 milliseconds. |
No |
USE_BATCH_MAX_SIZE |
Set this parameter to true and use the BATCH_MAX_SIZE parameter to specify the maximum number of messages you want tKafkaInput to receive before it automatically stops running. By default, this parameter is set to false and the default maximum number of messages is 5000. |
No |
USE_BATCH_MESSAGE_TIMEOUT |
Set this parameter to true and use the BATCH_MESSAGE_TIMEOUT parameter to specify the time (in milliseconds) tKafkaInput must wait for a new message before it stops running. By default, this parameter is set to false and the default timeout is 10000 milliseconds. |
No |
USE_HTTPS |
Set this parameter to true to enable SSL or TLS encryption of the connection, and use the HTTPS_SETTING parameter to specify the tSetKeystore component that you use to define the encryption information. This parameter works when the Kafka cluster version is Kafka 0.9.2.1 or higher. |
No |
USE_KRB |
If the Kafka cluster to be used is secured with Kerberos, set this parameter to true and use the following parameters to define the related security information:
This parameter works when the Kafka cluster version is Kafka 0.9.2.1 or higher. |
No |
KAFKA_CONSUMER_PROPERTIES {} |
If you need to use custom Kafka consumer configuration, include in this function one or more sets of the following parameters to specify the property or properties to be customized. Then at runtime, the customized property or properties will override the corresponding ones used by Talend Studio.
|
No |
KAFKA_CONSUMER_TIMEOUT |
Specify the time duration in milliseconds at the end of which you want a timeout exception to be returned if no message is available for consumption. The default value -1 means that no timeout is set. |
No |
SAVE_OFFSET |
Set this parameter to true to output the offsets of the consumed messages to the next component. When this parameter is set to true, a read-only column named offset is added to the schema. |
No |
CUSTOM_ENCODING |
If you encounter encoding issues when processing the stored data, set this parameter to true and use the following parameters to specify the right encoding:
|
No |
TSTATCATCHER_STATS |
Set this parameter to true to gather the processing metadata at the Job level as well as at each component level. By default, this parameter is set to false. |
No |
LABEL |
Use this parameter to specify a text label for the component. |
No |
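
The three batch-related settings in the table (USE_BATCH_MAX_DURATION, USE_BATCH_MAX_SIZE and USE_BATCH_MESSAGE_TIMEOUT) each describe a condition under which tKafkaInput stops running. The following Java sketch is only an illustration of how such conditions could be evaluated together; it is not the component's actual code. The class name BatchStopSketch, the method shouldStop and the rule that the first enabled condition reached ends the run are all assumptions made for this example. The default values are the ones quoted in the table.

```java
// Illustrative sketch only: BatchStopSketch and shouldStop are hypothetical
// names, not part of Talend Studio or the Kafka client.
final class BatchStopSketch {

    /**
     * Returns true when any enabled stop condition has been reached.
     * Assumption: the conditions are independent and the first one reached
     * ends the run; the table does not state how they combine.
     */
    static boolean shouldStop(long elapsedMs, long receivedCount, long idleMs,
                              boolean useMaxDuration, long maxDurationMs,
                              boolean useMaxSize, long maxSize,
                              boolean useMessageTimeout, long messageTimeoutMs) {
        // USE_BATCH_MAX_DURATION / BATCH_MAX_DURATION: total running time.
        if (useMaxDuration && elapsedMs >= maxDurationMs) {
            return true;
        }
        // USE_BATCH_MAX_SIZE / BATCH_MAX_SIZE: number of messages received.
        if (useMaxSize && receivedCount >= maxSize) {
            return true;
        }
        // USE_BATCH_MESSAGE_TIMEOUT / BATCH_MESSAGE_TIMEOUT: time waited
        // since the last message arrived.
        return useMessageTimeout && idleMs >= messageTimeoutMs;
    }

    public static void main(String[] args) {
        // Default values quoted in the table.
        long maxDurationMs = 600_000L;
        long maxSize = 5_000L;
        long messageTimeoutMs = 10_000L;

        // Only the size limit is enabled and 5000 messages were received.
        System.out.println(shouldStop(1_000L, 5_000L, 0L,
                false, maxDurationMs, true, maxSize, false, messageTimeoutMs));

        // All three conditions are disabled (the defaults), so the run continues.
        System.out.println(shouldStop(700_000L, 9_000L, 20_000L,
                false, maxDurationMs, false, maxSize, false, messageTimeoutMs));
    }
}
```

With all three USE_BATCH_* parameters left at their default value of false, none of the conditions in this sketch ever fires, which matches the table: the limits only take effect once the corresponding USE_BATCH_* parameter is set to true.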