
Change Data Partitioning on Hadoop

When Change Data Partitioning is enabled, the Replicate Change Tables in Hive are partitioned by the partition_name column. Data files are uploaded to HDFS according to the maximum file size and time definitions, and stored in a directory under the Change Table directory. When the specified partition timeframe ends, a partition is created in Hive, pointing to the HDFS directory.

Information about the partitions is written to the attrep_cdc_partitions Control Table.
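To make the mechanism concrete, the sketch below models how a change record's timestamp could map to a partition name (the start of its timeframe) and how a matching Hive partition could be registered against an HDFS directory. This is an illustrative approximation only: the naming scheme, the `partition_name` format, and the table/path names here are assumptions, not the exact convention Replicate uses.

```python
from datetime import datetime, timedelta

# Hypothetical: name the partition after the start of the timeframe
# that contains the record's timestamp (format is an assumption).
def partition_name(ts: datetime, timeframe: timedelta) -> str:
    epoch = datetime(1970, 1, 1)
    elapsed = (ts - epoch).total_seconds()
    start_secs = (elapsed // timeframe.total_seconds()) * timeframe.total_seconds()
    start = epoch + timedelta(seconds=start_secs)
    return start.strftime("%Y%m%d%H%M%S")

# Hypothetical DDL: when a timeframe closes, a Hive partition is added
# that points at the HDFS directory holding that timeframe's data files.
def add_partition_ddl(table: str, name: str, base_dir: str) -> str:
    return (
        f"ALTER TABLE {table} ADD IF NOT EXISTS "
        f"PARTITION (partition_name='{name}') "
        f"LOCATION '{base_dir}/{name}'"
    )

# Example: a change captured at 12:34 with a one-hour timeframe lands
# in the partition that started at 12:00.
name = partition_name(datetime(2024, 1, 1, 12, 34), timedelta(hours=1))
ddl = add_partition_ddl("my_table__ct", name, "/user/replicate/my_table__ct")
```

With a one-hour timeframe, `partition_name` above yields `20240101120000` for the 12:34 record, and the generated DDL points the new partition at the corresponding directory under the Change Table directory.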

Prerequisites

The prerequisites for using Change Data Partitioning with a Hadoop target endpoint are as follows:

  • The target file format must be set to Text or Sequence.
  • Hive access must be set to ODBC.
