tBoundedStreamInput properties for Apache Spark Streaming
These properties are used to configure tBoundedStreamInput running in the Spark Streaming Job framework.
The Spark Streaming tBoundedStreamInput component belongs to the Technical family.
This component is available in Talend Real-Time Big Data Platform and Talend Data Fabric.
Basic settings
Schema and Edit Schema |
A schema is a row description. It defines the number of fields (columns) to be processed and passed on to the next component. When you create a Spark Job, avoid the reserved word "line" when naming the fields. Click Edit schema to make changes to the schema. If the current schema is of the Repository type, three options are available:
|
Built-In: You create and store the schema locally for this component only. |
Repository: You have already created the schema and stored it in the Repository. You can reuse it in various projects and Job designs. |
Mode |
Select the mode that you want to use to generate the data stream.
In either mode, the data you provide must use the separators you have defined in the Row separator, Field Separator and Micro batch separator fields. |
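As a rough illustration of how the three separators relate to one another, the following plain Python sketch splits a bounded block of text into micro batches, then rows, then fields. It is not the component's implementation: the function name split_micro_batches and the separator values (newline, semicolon, "---") are assumptions chosen for the example, and the actual values are whatever you enter in the Row separator, Field Separator and Micro batch separator fields.

```python
def split_micro_batches(data, row_sep="\n", field_sep=";", batch_sep="---"):
    """Split bounded input text into micro batches, each a list of rows of fields.

    Hypothetical helper for illustration only; the separator defaults are
    assumed values, not defaults of the component.
    """
    batches = []
    for chunk in data.split(batch_sep):
        # Drop empty rows left over around the batch separator.
        rows = [row for row in chunk.split(row_sep) if row.strip()]
        batches.append([row.split(field_sep) for row in rows])
    return batches


# Two micro batches: the first holds two rows, the second holds one.
sample = "1;alice\n2;bob\n---\n3;carol\n"
batches = split_micro_batches(sample)
print(batches)
# [[['1', 'alice'], ['2', 'bob']], [['3', 'carol']]]
```

Data that does not use the configured separators consistently would not split into the expected rows and fields, which is why the separators in the data must match the ones defined in the component.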
Usage
Usage rule |
This component is used as a start component and requires an output link. It is added automatically to a test case when the test case is created, to provide input data. |
Spark Connection |
In the Spark Configuration tab in the Run view, define the connection to a given Spark cluster for the whole Job. In addition, since the Job expects its dependent jar files for execution, you must specify the directory in the file system to which these jar files are transferred so that Spark can access these files:
This connection is effective on a per-Job basis. |