tCassandraOutput Standard properties
These properties are used to configure tCassandraOutput running in the Standard Job framework.
The Standard tCassandraOutput component belongs to the Big Data and the Databases NoSQL families.
The component in this framework is available in all Talend products with Big Data and in Talend Data Fabric.
Basic settings
Property type |
Either Built-In or Repository. Built-In: No property data stored centrally. Repository: Select the repository file where the properties are stored. |
Use existing connection |
Select this check box and in the Component List drop-down list, select the desired connection component to reuse the connection details you already defined. |
DB Version |
Select the Cassandra version you are using. |
Host |
Enter the hostname or IP address of the Cassandra server. |
Port |
Enter the listening port number of the Cassandra server. |
Datacenter |
Enter the name of the Cassandra datacenter. |
Required authentication |
Select this check box to provide credentials for the Cassandra authentication. This check box appears only if you do not select the Use existing connection check box. |
Username |
Fill in this field with the username for the Cassandra authentication. This field is only available when you select the Required authentication check box. |
Password |
Fill in this field with the password for the Cassandra authentication. To enter the password, click the [...] button next to the password field, enter the password in double quotes in the pop-up dialog box, and click OK to save the settings. This field is only available when you select the Required authentication check box. |
Use SSL |
Select this check box to enable the SSL or TLS encrypted connection. Then you need to use the tSetKeystore component in the same Job to specify the encryption information. |
Keyspace |
Type in the name of the keyspace into which you want to write data. |
Action on keyspace |
Select the operation you want to perform on the defined keyspace. |
Column family |
Type in the name of the column family (table) into which you want to write data. |
Action on column family |
Select the operation you want to perform on the column family to be used. This field is only available when you select either Insert or Update from the Action on data drop-down list. |
Action on data |
Select the action you want to perform on the data of the defined table. For more advanced actions, use the Advanced settings view. |
Schema and Edit schema |
A schema is a row description. It defines the number of fields (columns) to be processed and passed on to the next component. When you create a Spark Job, avoid the reserved word line when naming the fields. Click Edit schema to make changes to the schema. If the current schema is of the Repository type, three options are available:
View schema: select this option to view the schema only.
Change to built-in property: select this option to change the schema to Built-in for local changes.
Update repository connection: select this option to change the schema stored in the repository and decide whether to propagate the changes to all the Jobs upon completion. |
|
Built-In: You create and store the schema locally for this component only. |
|
Repository: You have already created the schema and stored it in the Repository. You can reuse it in various projects and Job designs. When the schema to be reused has default values that are integers or functions, ensure that these default values are not enclosed within quotation marks. If they are, you must remove the quotation marks manually. For more information, see Retrieving table schemas. |
Advanced settings
Batch Size |
Number of lines in each processed batch. When you are using the Datastax API, this feature is displayed only when you have selected the Use unlogged batch check box. |
Use unlogged batch |
Select this check box to handle data in batches using Cassandra's UNLOGGED approach. This feature is available to the following three actions: Insert, Update and Delete. You then need to configure how the batch mode works, for example through the Batch Size field.
The ideal situation for using batches with Cassandra is when a small number of tables must synchronize the data to be inserted or updated. In this UNLOGGED approach, the Job does not write batches into Cassandra's batchlog system and thus avoids the performance penalty incurred by that writing. For further information about the Cassandra BATCH statement and the UNLOGGED approach, see Batches. |
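The UNLOGGED batch described above corresponds to Cassandra's BATCH statement. A minimal sketch of the kind of CQL involved (the keyspace, table, and column names are invented for illustration; the component generates its own statements):

```python
# Sketch of an UNLOGGED batch CQL statement; ks.users and its columns
# are hypothetical names, not anything generated by tCassandraOutput.
rows = [(1, "alice"), (2, "bob"), (3, "carol")]
inserts = [
    f"  INSERT INTO ks.users (id, name) VALUES ({i}, '{n}');"
    for i, n in rows
]
batch_cql = "BEGIN UNLOGGED BATCH\n" + "\n".join(inserts) + "\nAPPLY BATCH;"
print(batch_cql)
```

Because the batch is UNLOGGED, Cassandra skips the batchlog write, which is the performance saving mentioned above.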
Insert if not exists |
Select this check box to insert a row only when it does not already exist in the target table. This feature is available to the Insert action only. |
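In CQL terms, this option corresponds to Cassandra's lightweight-transaction clause INSERT ... IF NOT EXISTS. A sketch, with hypothetical table and column names:

```python
# Hypothetical table/columns; the point is the trailing IF NOT EXISTS clause.
table = "ks.users"
cols = ["id", "name"]
vals = ["1", "'alice'"]
cql = (
    f"INSERT INTO {table} ({', '.join(cols)}) "
    f"VALUES ({', '.join(vals)}) IF NOT EXISTS;"
)
print(cql)
```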
Delete if exists |
Select this check box to remove from the target table only the rows that match records in the incoming flow. This feature is available to the Delete action only. |
Use TTL |
Select this check box to write the TTL data in the target table. In the column list that is displayed, you need to select the column to be used as the TTL column. The DB type of this column must be Int. This feature is available to the Insert action and the Update action only. |
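Cassandra expresses a row's time-to-live in seconds through the USING TTL clause, which is what the Int value from the selected column feeds. A sketch with invented names:

```python
# Hypothetical table; the TTL value would come from the selected Int column.
ttl_seconds = 86400  # row expires one day after the write
cql = (
    "INSERT INTO ks.users (id, name) VALUES (1, 'alice') "
    f"USING TTL {ttl_seconds};"
)
print(cql)
```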
Use Timestamp |
Select this check box to write the timestamp data in the target table. In the column list that is displayed, you need to select the column to be used to store the timestamp data. The DB type of this column must be BigInt. This feature is available to the following actions: Insert, Update and Delete. |
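The write timestamp Cassandra attaches via USING TIMESTAMP is a microseconds-since-epoch value, which is why the selected column must be BigInt rather than Int. A sketch with hypothetical names:

```python
import time

# Microseconds since epoch: too large for a 32-bit int, hence the BigInt type.
write_ts = int(time.time() * 1_000_000)
cql = f"UPDATE ks.users USING TIMESTAMP {write_ts} SET name = 'bob' WHERE id = 1;"
print(cql)
```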
IF condition |
Add the condition to be met for the Update or the Delete action to take place. This condition allows you to be more precise about the columns to be updated or deleted. |
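The IF condition maps onto Cassandra's conditional UPDATE and DELETE statements (lightweight transactions). A sketch, with a hypothetical condition on a column value:

```python
# Hypothetical table and condition; the IF clause makes the write conditional.
condition = "name = 'alice'"
update_cql = f"UPDATE ks.users SET name = 'bob' WHERE id = 1 IF {condition};"
delete_cql = f"DELETE FROM ks.users WHERE id = 1 IF {condition};"
print(update_cql)
print(delete_cql)
```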
Special assignment operation |
Complete this table to construct advanced Cassandra SET commands that make the Update action more specific, for example, adding a record to the beginning or to a particular position of a given column. In the Update column column of this table, select the column to be updated, then select the operation to use from the Operation column. |
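Cassandra's SET syntax for collection columns distinguishes prepending, appending, and assigning by position, which is the kind of statement these operations build. A sketch over a hypothetical list column named tags:

```python
# Hypothetical list column "tags"; three collection SET forms in CQL.
prepend = "UPDATE ks.users SET tags = ['new'] + tags WHERE id = 1;"
append = "UPDATE ks.users SET tags = tags + ['new'] WHERE id = 1;"
by_index = "UPDATE ks.users SET tags[0] = 'new' WHERE id = 1;"
for cql in (prepend, append, by_index):
    print(cql)
```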
Row key in the List type |
Select the column to be used to construct the WHERE clause of Cassandra to perform the Update or the Delete action on only selected rows. The column(s) to be used in this table should be from the set of the Primary key columns of the Cassandra table. |
Delete collection column based on position/key |
Select the column to be used as reference to locate the particular row(s) to be removed. This feature is available only to the Delete action. |
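Removing a single element of a collection uses CQL's DELETE with a list index or a map key, which is what this position/key reference drives. A sketch with hypothetical column names:

```python
# Hypothetical list column "tags" and map column "prefs".
by_position = "DELETE tags[2] FROM ks.users WHERE id = 1;"       # list element
by_key = "DELETE prefs['theme'] FROM ks.users WHERE id = 1;"     # map entry
print(by_position)
print(by_key)
```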
tStatCatcher Statistics |
Select this check box to gather the Job processing metadata at the Job level as well as at each component level. |
Global Variables
Global Variables |
NB_LINE: the number of rows read by an input component or transferred to an output component. This is an After variable and it returns an integer.
ERROR_MESSAGE: the error message generated by the component when an error occurs. This is an After variable and it returns a string. This variable functions only if the Die on error check box is cleared, if the component has this check box.
A Flow variable functions during the execution of a component while an After variable functions after the execution of the component.
To fill in a field or expression with a variable, press Ctrl+Space to access the variable list and choose the variable to use from it.
For more information about variables, see Using contexts and variables. |
Usage
Usage rule |
This component is used as an output component and it always needs an incoming link. |