tHiveOutput
Connects to a given Hive database and writes the data it receives into a given Hive table or a directory in HDFS.
When ACID is enabled on the Hive side, a Spark Job cannot delete or update a table. In addition, unless the data has been compacted, the Job cannot correctly read aggregated data from a Hive table. This is a known limitation described in the Spark bug tracking system: https://issues.apache.org/jira/browse/SPARK-15348.
Depending on the Talend product you are using, this component can be used in one, some or all of the following Job frameworks:
- Spark Batch: see tHiveOutput properties for Apache Spark Batch.
  The component in this framework is available in all subscription-based Talend products with Big Data and Talend Data Fabric.
- Spark Streaming: see tHiveOutput properties for Apache Spark Streaming.
  This component is available in Talend Real Time Big Data Platform and Talend Data Fabric.