
Change Data Capture (CDC)

Change Data Capture CDC controller

The CDC operator performs a full join on records from two inputs (a primary entity and a secondary entity) based on one or more user-specified join keys.

The operator then compares the fields named in the user-specified compare criteria.

Based on this comparison, records are classified as (I) Inserted, (U) Updated, (D) Deleted, (N) No-change, or (E) Error. The Record Output setting determines which of these appear in the output:

Changed Only (default): Output contains a union of all (I) Inserted, (U) Updated, and (D) Deleted records between the two source entities, as indicated in the IUD field.

Full: Output contains a union of all (I) Inserted, (U) Updated, (D) Deleted, and (N) No-change records between the two source entities, as indicated in the IUD field.

The target entity contains a union of all records and has an IUD field (podium_iud), which indicates each record's change status.
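The classification described above can be sketched in plain Python. This is a minimal illustration, not the product's implementation: the function name `classify_changes`, the sample records, and the choice to emit the secondary entity's values for I, U, and N rows and the primary entity's values for D rows are all assumptions. Error (E) records are not modeled.

```python
def classify_changes(primary, secondary, join_keys, compare_fields):
    """Full-join two lists of dict records and tag each row with a change flag.

    Hypothetical helper: only the flag values and the podium_iud field name
    come from the documentation; everything else is illustrative.
    """
    def key_of(record):
        return tuple(record[k] for k in join_keys)

    old_rows = {key_of(r): r for r in primary}
    new_rows = {key_of(r): r for r in secondary}

    tagged = []
    # A full join visits every key that appears on either side.
    for key in sorted(old_rows.keys() | new_rows.keys()):
        old, new = old_rows.get(key), new_rows.get(key)
        if old is None:
            flag, row = "I", new  # only in secondary: inserted
        elif new is None:
            flag, row = "D", old  # only in primary: deleted
        elif any(old[f] != new[f] for f in compare_fields):
            flag, row = "U", new  # in both, a compared field differs
        else:
            flag, row = "N", new  # in both, no change
        tagged.append({**row, "podium_iud": flag})
    return tagged


# Made-up sample data: primary is the baseline, secondary is the newer set.
primary = [
    {"id": 1, "name": "Anne", "phone": "333-444-5555"},
    {"id": 2, "name": "Ben", "phone": "222-333-4444"},
    {"id": 3, "name": "Cara", "phone": "111-222-3333"},
]
secondary = [
    {"id": 1, "name": "Anne", "phone": "333-444-5000"},
    {"id": 3, "name": "Cara", "phone": "111-222-3333"},
    {"id": 4, "name": "Dev", "phone": "999-888-7777"},
]

result = classify_changes(primary, secondary, ["id"], ["name", "phone"])
flags = {row["id"]: row["podium_iud"] for row in result}
```

With this sample data, record 1 is tagged U, record 2 D, record 3 N, and record 4 I.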

Change Data Capture dataflow displaying iud field

Application of CDC package

To apply the CDC operator, select and add two entities (a primary baseline and the secondary entity that will provide the change data) to the canvas.

Once the CDC package is dragged to the canvas and connected to the entities, define the CDC operator settings.

CDC requirements

The CDC package expects two inputs (one for the primary table and one for the newer secondary table). Both entities should have the same record format (name, data type, index).
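A pre-check for the record format requirement might look like the following sketch. It assumes each entity's format can be represented as an ordered list of (name, data type) tuples, with list position standing in for the field index. The helper name and the type labels are hypothetical and not part of the product.

```python
from itertools import zip_longest


def record_format_differences(primary_fields, secondary_fields):
    """Return (index, primary_field, secondary_field) for every mismatch.

    Fields are compared position by position, so a renamed, retyped,
    reordered, or missing field is reported at the index where it occurs.
    """
    differences = []
    pairs = zip_longest(primary_fields, secondary_fields)
    for index, (left, right) in enumerate(pairs):
        if left != right:
            differences.append((index, left, right))
    return differences


# Hypothetical formats: the second one swaps the last two fields.
primary_fields = [("id", "INTEGER"), ("name", "STRING"), ("phone", "STRING")]
reordered_fields = [("id", "INTEGER"), ("phone", "STRING"), ("name", "STRING")]

no_diff = record_format_differences(primary_fields, primary_fields)
diff = record_format_differences(primary_fields, reordered_fields)
```

An empty result means the two formats match on name, data type, and index.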

The order in which the ports are connected affects the functionality. The secondary entity added to the graph (which should also be the "changed" entity) should be connected first, followed by the primary entity.

The CDC operator must always be the last component in a dataflow: inputs to the CDC must come last in the sequence of other controls (Filter, Router, Transform, etc.) so that dataflow transformations factor into the calculation.

CDC operator Settings

  • ADD CRITERIA: Specify fields (join keys) for Full Join or Swap, where the right entity takes on join primacy and data change is captured in that direction, from the secondary entity to the primary entity.

Change Data Capture field on which to join criteria added

COMPARE

  • ADD COMPARE: Add and specify fields for comparison between entities. If the Compare criteria are empty, all fields except the join keys are compared.
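The default described for empty Compare criteria can be expressed as a small rule. The sketch below is illustrative only; `resolve_compare_fields` and the field names are hypothetical.

```python
def resolve_compare_fields(all_fields, join_keys, compare_criteria=None):
    """Pick the fields to compare.

    If compare criteria were given, use them as-is. Otherwise fall back to
    every field that is not a join key, preserving the original field order.
    """
    if compare_criteria:
        return list(compare_criteria)
    return [field for field in all_fields if field not in join_keys]


all_fields = ["id", "name", "phone"]

default_fields = resolve_compare_fields(all_fields, ["id"])
explicit_fields = resolve_compare_fields(all_fields, ["id"], ["phone"])
```

Here `default_fields` is `["name", "phone"]`, while `explicit_fields` contains only `"phone"`.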

Change Data Capture fields to compare

SETTING

  • Log Changes: If checked (default), two sub-folders are created in the receiving directory of the master entity: a '/good' folder containing the output data as usual, and a '/log' folder containing the log of changes, if any. If unchecked, no log is created.

 

Change Data Capture setting for logs and record output

The log output lists the compare key and all columns (fields); updated fields show both the old and new values if a change occurred:

Log Format: compare key(s): value; column 1 (unchanged): value; column 2 (unchanged): value; column 3 (changed): old value | new value

(example): nid; Anne; Townsend; 333-444-5555|333-444-5000
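To make the log format concrete, the sketch below builds the cells of one log entry from an old and a new version of a record. It assumes a changed column is rendered as the old and new values joined by "|", as in the example above. The function name, the field names, the key value 7, and the use of a list of cells rather than a specific file delimiter are all assumptions.

```python
def format_log_cells(old, new, join_keys):
    """Build log cells: key values first, then one cell per other column.

    Unchanged columns show their value; changed columns show "old|new".
    """
    cells = [str(new[key]) for key in join_keys]
    for field in new:
        if field in join_keys:
            continue
        if old[field] == new[field]:
            cells.append(str(new[field]))
        else:
            cells.append(f"{old[field]}|{new[field]}")
    return cells


# Hypothetical record versions; only the phone number differs.
old = {"nid": 7, "first": "Anne", "last": "Townsend", "phone": "333-444-5555"}
new = {"nid": 7, "first": "Anne", "last": "Townsend", "phone": "333-444-5000"}

cells = format_log_cells(old, new, ["nid"])
```

For these two versions, the last cell is `333-444-5555|333-444-5000` and the other cells carry the unchanged values.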

Record output settings:

  • Changed only (default): Target will contain only D, U, and I records
  • Full: Target will contain D, U, I, and N records (a full set)

The change status codes are:

  • D: Delete
  • U: Update (change in existing record)
  • I: Insert (new records)
  • N: Unchanged
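The two Record Output modes amount to a filter on the change flag. The following sketch assumes rows are dicts carrying a podium_iud value. The mode names `changed_only` and `full` and the helper `apply_record_output` are hypothetical labels for illustration.

```python
# Which change flags each Record Output mode keeps.
RECORD_OUTPUT_FLAGS = {
    "changed_only": {"D", "U", "I"},
    "full": {"D", "U", "I", "N"},
}


def apply_record_output(tagged_rows, mode="changed_only"):
    """Keep only the rows whose change flag is allowed by the chosen mode."""
    keep = RECORD_OUTPUT_FLAGS[mode]
    return [row for row in tagged_rows if row["podium_iud"] in keep]


tagged_rows = [
    {"id": 1, "podium_iud": "U"},
    {"id": 2, "podium_iud": "D"},
    {"id": 3, "podium_iud": "N"},
    {"id": 4, "podium_iud": "I"},
]

changed_ids = [row["id"] for row in apply_record_output(tagged_rows)]
full_ids = [row["id"] for row in apply_record_output(tagged_rows, "full")]
```

In this sample, the default mode drops only the unchanged record 3, and the full mode keeps all four rows.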

Define and Add Target Entity

Save, Validate, and Execute the dataflow.

Example of executed CDC dataflow target entity with sample data

CDC target entity shows column indicating change status with D, I, U, or N

Behavior of CDC logs

  • Logs are saved in the first entity if multiple targets are selected through a Router.
  • Logs are not filtered if a Filter is applied after the CDC control.
  • Logs are filtered if a Filter is applied before the CDC control.

 
