
Component-specific settings for tKafkaInput

The following table describes the Job script functions and parameters that you can define in the setSettings {} function of the component.

Function/parameter Description Mandatory?

OUTPUT_TYPE

Specify the type of data to be sent to the next component.

  • STRING (default)
  • BYTES

Setting this parameter to STRING is generally recommended, because tKafkaInput can automatically convert the Kafka byte[] messages into strings for the Job to process. However, if the messages use a format that tKafkaInput does not know, such as Protobuf, set this parameter to BYTES and use a custom code component such as tJavaRow to deserialize the messages into strings so that the other components of the same Job can process them.

No
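When OUTPUT_TYPE is BYTES, the deserialization happens in custom code. The following Java sketch shows the kind of logic that might go into a tJavaRow component, assuming the payload is plain UTF-8 text. The class and method names are hypothetical and not part of Talend; a Protobuf payload would need the parser generated for its message type instead.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper; not part of Talend or Kafka.
final class KafkaBytesDecoder {

    // Decode a raw Kafka payload into a string, assuming UTF-8 text.
    static String decode(byte[] payload) {
        if (payload == null) {
            return null; // a Kafka message value can be null
        }
        return new String(payload, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] raw = "order-42".getBytes(StandardCharsets.UTF_8);
        System.out.println(decode(raw));
    }
}
```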

USE_EXISTING_CONNECTION

Set this parameter to true and specify the name of the relevant connection component using the CONNECTION parameter to reuse the connection details you already defined.

No

KAFKA_VERSION

Specify the version of the Kafka cluster to be used. Acceptable values:

  • KAFKA_0_10_0_1
  • KAFKA_0_9_0_1
  • KAFKA_0_8_2_0

Yes

ZOOKEEPER_CONNECT

Specify the address of the ZooKeeper service of the Kafka cluster to be used, in the form of "\"zk1:port1,zk2:port2,...\"". This parameter works only when the Kafka cluster version is Kafka 0.8.2.0.

Yes

BROKER_LIST

Specify the addresses of the broker nodes of the Kafka cluster to be used, in the form of "\"host1:port1,host2:port2,...\"".

This parameter works when the Kafka cluster version is Kafka 0.9.0.1 or higher.

Yes
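To illustrate the expected shape of the broker list, here is a minimal Java sketch that checks a comma-separated host:port string before it is placed in a Job script. The class name, the host names, and the port numbers are illustrative assumptions, not part of Talend.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical validator for a "host1:port1,host2:port2" style list.
final class BrokerListCheck {

    // Returns the trimmed entries, or throws if an entry is not host:port.
    static List<String> parse(String brokerList) {
        List<String> brokers = new ArrayList<>();
        for (String entry : brokerList.split(",")) {
            String trimmed = entry.trim();
            int colon = trimmed.lastIndexOf(':');
            if (colon <= 0 || colon == trimmed.length() - 1) {
                throw new IllegalArgumentException("Expected host:port, got: " + trimmed);
            }
            // A non-numeric port raises NumberFormatException,
            // which is itself an IllegalArgumentException.
            int port = Integer.parseInt(trimmed.substring(colon + 1));
            if (port < 1 || port > 65535) {
                throw new IllegalArgumentException("Port out of range: " + port);
            }
            brokers.add(trimmed);
        }
        return brokers;
    }

    public static void main(String[] args) {
        System.out.println(parse("host1:9092, host2:9093"));
    }
}
```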

KAFKA_TOPIC

Specify the name of the topic from which this component receives the feed of messages.

Yes

GROUP_ID

Specify the name of the consumer group to which you want the current consumer to belong.

This consumer group will be created at runtime if it does not exist at that moment.

Yes

RESET_OFFSET

Set this parameter to true to clear the offsets saved for the consumer group to be used so that this consumer group is handled as a new group that has not consumed any messages.

By default, this parameter is set to false.

No

AUTO_OFFSET_RESET

Specify the starting point from which the messages of a topic are consumed. Acceptable values:

  • SMALLEST
  • LARGEST (default)

This parameter works only when the Kafka cluster version is Kafka 0.8.2.0.

No

AUTO_OFFSET_RESET_NEW

Specify the starting point from which the messages of a topic are consumed. Acceptable values:

  • EARLIEST
  • LATEST (default)

This parameter works when the Kafka cluster version is Kafka 0.9.0.1 or higher.

No

KAFKA_OFFSET_STORAGE

Specify the system to which you want to commit the offsets of the consumed messages. Acceptable values:

  • ZOOKEEPER (default)
  • KAFKA

This parameter works only when the Kafka cluster version is Kafka 0.8.2.0.

No

KAFKA_DUAL_COMMIT_CHECK

Set this parameter to true to commit the offsets to both ZooKeeper and Kafka; set it to false to commit the offsets only to Kafka.

By default, this parameter is set to true.

This parameter works only when the offset storage system is Kafka.

No

AUTO_COMMIT_OFFSET

Set this parameter to true and use the KAFKA_COMMIT_INTERVAL parameter to specify a time interval to make tKafkaInput automatically save its consumption state at the end of each given time interval.

By default, this parameter is set to true and the default time interval is 5000 milliseconds.

Note that the offsets are committed only at the end of each interval. If your Job stops in the middle of an interval, the message consumption state within this interval is not committed.

No
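The effect of the commit interval on a Job that stops early can be worked through with a little arithmetic. The Java sketch below is a simplified model, not Talend or Kafka code: it assumes commits fire exactly at multiples of the interval, counted from the moment the consumer starts, and the stop time is a hypothetical value.

```java
// Simplified model: auto-commits are assumed to fire exactly at
// multiples of the interval, counted from consumer start.
final class CommitWindow {

    // Time (ms since start) of the last auto-commit completed before the stop.
    static long lastCommitMs(long stopMs, long intervalMs) {
        return (stopMs / intervalMs) * intervalMs;
    }

    // Span (ms) of consumption whose offsets were never committed.
    static long uncommittedMs(long stopMs, long intervalMs) {
        return stopMs - lastCommitMs(stopMs, intervalMs);
    }

    public static void main(String[] args) {
        long interval = 5000; // default commit interval given in the table
        long stop = 12000;    // hypothetical moment at which the Job stops
        System.out.println("last commit at " + lastCommitMs(stop, interval) + " ms");
        System.out.println("uncommitted span " + uncommittedMs(stop, interval) + " ms");
    }
}
```

Under these assumptions, a Job that stops 12000 ms after starting has committed its state up to the 10000 ms mark, and the consumption state of the final 2000 ms is lost.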

USE_BATCH_MAX_DURATION

Set this parameter to true and use the BATCH_MAX_DURATION parameter to specify the duration (in milliseconds) at the end of which tKafkaInput stops running.

By default, this parameter is set to false and the default duration is 600000 milliseconds.

No

USE_BATCH_MAX_SIZE

Set this parameter to true and use the BATCH_MAX_SIZE parameter to specify the maximum number of messages you want tKafkaInput to receive before it automatically stops running.

By default, this parameter is set to false and the default maximum number of messages is 5000.

No

USE_BATCH_MESSAGE_TIMEOUT

Set this parameter to true and use the BATCH_MESSAGE_TIMEOUT parameter to specify the time (in milliseconds) tKafkaInput must wait for a new message before it stops running.

By default, this parameter is set to false and the default timeout is 10000 milliseconds.

No
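The three USE_BATCH_* parameters each define a condition under which tKafkaInput stops. The Java sketch below models how such limits could be combined, assuming the component stops as soon as any enabled limit is reached. That combination rule, and the class itself, are assumptions made for illustration rather than documented Talend behavior.

```java
// Hypothetical model of the three stop conditions; a null limit
// stands for the corresponding USE_BATCH_* parameter being false.
final class BatchStopPolicy {
    private final Long maxDurationMs;    // BATCH_MAX_DURATION
    private final Long maxSize;          // BATCH_MAX_SIZE
    private final Long messageTimeoutMs; // BATCH_MESSAGE_TIMEOUT

    BatchStopPolicy(Long maxDurationMs, Long maxSize, Long messageTimeoutMs) {
        this.maxDurationMs = maxDurationMs;
        this.maxSize = maxSize;
        this.messageTimeoutMs = messageTimeoutMs;
    }

    // Assumed rule: stop when any enabled limit has been reached.
    boolean shouldStop(long elapsedMs, long receivedCount, long idleMs) {
        if (maxDurationMs != null && elapsedMs >= maxDurationMs) {
            return true;
        }
        if (maxSize != null && receivedCount >= maxSize) {
            return true;
        }
        return messageTimeoutMs != null && idleMs >= messageTimeoutMs;
    }

    public static void main(String[] args) {
        // All three limits enabled with the default values from the table.
        BatchStopPolicy policy = new BatchStopPolicy(600000L, 5000L, 10000L);
        System.out.println(policy.shouldStop(1000L, 5000L, 0L));
    }
}
```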

USE_HTTPS

Set this parameter to true to enable SSL or TLS encryption of the connection, and use the HTTPS_SETTING parameter to specify the tSetKeystore component that you use to define the encryption information.

This parameter works when the Kafka cluster version is Kafka 0.9.0.1 or higher.

No

USE_KRB

If the Kafka cluster to be used is secured with Kerberos, set this parameter to true and use the following parameters to define the related security information:

  • JAAS_CONF: specify the path to the JAAS configuration file to be used by the Job to authenticate as a client to Kafka.

  • KRB_SERVICE_NAME: specify the primary part of the Kerberos principal you defined for the brokers when creating the broker cluster.

    For example, for the principal kafka/kafka1.hostname.com@EXAMPLE.COM, the value of this parameter is kafka.

  • SET_KINIT_PATH: Kerberos uses a default path to its kinit executable. If you have changed this path, set this parameter to true and use the KINIT_PATH parameter to specify the custom access path.

  • SET_KRB5_CONF: Kerberos uses a default path to its configuration file, for example krb5.conf (or krb5.ini in Windows) for Kerberos 5. If you have changed this path, set this parameter to true and use the KRB5_CONF parameter to specify the custom access path to the Kerberos configuration file.

This parameter works when the Kafka cluster version is Kafka 0.9.0.1 or higher.

No
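The value expected by KRB_SERVICE_NAME is the part of the principal that precedes the first slash. As a minimal Java sketch of that rule, using a hypothetical helper that is not part of Talend or of any Kerberos library:

```java
// Hypothetical helper that extracts the primary part of a Kerberos
// principal written as primary/instance@REALM or primary@REALM.
final class KerberosPrincipal {

    static String primary(String principal) {
        int end = principal.indexOf('/');
        if (end < 0) {
            end = principal.indexOf('@'); // principal without an instance part
        }
        return end < 0 ? principal : principal.substring(0, end);
    }

    public static void main(String[] args) {
        // Principal taken from the example in the table.
        System.out.println(primary("kafka/kafka1.hostname.com@EXAMPLE.COM"));
    }
}
```

For the principal given in the table, the sketch returns kafka.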

KAFKA_CONSUMER_PROPERTIES {}

If you need to use custom Kafka consumer configuration, include in this function one or more sets of the following parameters to specify the property or properties to be customized. Then at runtime, the customized property or properties will override the corresponding ones used by Talend Studio.

  • PROPERTY: Type in the name of the property.
  • VALUE: Type in the new value of the property.

No
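The override behavior described for the custom consumer properties can be pictured as a merge in which custom entries win. The Java sketch below illustrates this with java.util.Properties. The class is hypothetical, the property names are standard Kafka consumer settings, and the values are illustrative rather than the defaults Talend Studio actually uses.

```java
import java.util.Properties;

// Hypothetical illustration of "custom properties override the defaults".
final class ConsumerPropertyOverride {

    // Returns a new Properties object; neither input is modified.
    static Properties merge(Properties defaults, Properties custom) {
        Properties merged = new Properties();
        merged.putAll(defaults);
        merged.putAll(custom); // later entries replace earlier ones
        return merged;
    }

    public static void main(String[] args) {
        Properties defaults = new Properties();
        defaults.setProperty("session.timeout.ms", "30000"); // illustrative value
        defaults.setProperty("fetch.min.bytes", "1");        // illustrative value

        Properties custom = new Properties();
        custom.setProperty("session.timeout.ms", "45000");   // PROPERTY / VALUE pair

        System.out.println(merge(defaults, custom).getProperty("session.timeout.ms"));
    }
}
```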

KAFKA_CONSUMER_TIMEOUT

Specify the time duration in milliseconds at the end of which you want a timeout exception to be returned if no message is available for consumption.

The default value -1 means that no timeout is set.

No

SAVE_OFFSET

Set this parameter to true to output the offsets of the consumed messages to the next component.

When this parameter is set to true, a read-only column called offset is added to the schema.

No

CUSTOM_ENCODING

In case of encoding issues when processing the stored data, set this parameter to true and use the following parameters to specify the right encoding:

  • ENCODING
  • ENCODING:ENCODING_TYPE

No

TSTATCATCHER_STATS

Set this parameter to true to gather the processing metadata at the Job level as well as at each component level.

By default, this parameter is set to false.

No

LABEL

Use this parameter to specify a text label for the component.

No
