Skip to main content

Qlik Catalog record count reconciliation

For many enterprises, reconciliation of record counts is a key feature for accurate ingest of business data. Qlik Catalog provides a mechanism for users to define a method/location from which to extract expected record count for reconciliation with observed record count upon load. If expected and observed record counts are not equal the load will fail, alerting users of the mismatch and generating a log entry indicating the discrepancy.

Qlik Catalog supports record count reconciliation for entities resulting from ingest of multiple files. The files for ingestion may have an expected record count in the header or trailer or a separate manifest file. In this scenario, the file pattern for ingestion might be a wildcard (with a specified search pattern (e.g., foo/*/bar, /receiptdate*/). In cases where a wildcard is specified (in src.file.glob) and an ExpectedRecordCount is found in the header or trailer of each file, the application will sum the individual file ExpectedRecordCounts to create a totalExpectedRecordCount that is used for the partition or work order. If a separate manifest file is provided then the ExpectedRecordCount will be used in the manifest as the WorkOrder.expectedRecordCount, regardless of the number of files specified by the src.file.glob.

The following methods are supported by format. Users select the appropriate extraction method via dropdowns in the property panel.

Record count extraction methods

COBOL

PATH NAME REGEX

FILE NAME REGEX

MANIFEST REGEX

MANIFEST ENTIRELY

COBOL HEADER FIELD

COBOL TRAILER FIELD

FDL

PATH NAME REGEX

FILE NAME REGEX

HEADER REGEX

TRAILER REGEX

MANIFEST REGEX

HEADER ENTIRELY

TRAILER ENTIRELY

MANIFEST ENTIRELY

HEADER_DELIMITED_COLUMN_INDEX

TRAILER_DELIMITED_COLUMN_INDEX

MANIFEST_DELIMITED_COLUMN_INDEX

JDBC

Not currently supported

XML

XML ELEMENT

Record count extraction method dropdown

Record count extraction method values
Configuration examples: Record count extraction properties

Key

Value

Meaning

Extraction Method

TRAILER_REGEX

Trailer holds Record Count, regex will match and extract the pattern

Extraction Argument

(\d\d\d\d)\s\s?

Searches for regex pattern:

  • (\d\d\d\d) : Capturing group
  • \d : Match a digit [0-9]
  • \d : Match a digit [0-9]
  • \d : Match a digit [0-9]
  • \d : Match a digit [0-9]
  • \s : Match any white space char [\r\n\t\f]
  • \s : Match any white space char

    [\r\n\t\f]

  • ? : Quantifier: Between zero and one time, as many times as possible.

Record Count Adjustment

0

No adjustment for Header/Trailer

 

The following properties can be set to define method, argument, record count adjustment (e.g., adjusting for header and trailer), and manifest file location.

Record Count Extraction Key/Value Properties
Key Value

Record Count Extraction Method

dataset.expected.record.count.extraction.method

Dropdown Values:

  • NONE
  • PATH_NAME_REGEX
  • FILE_NAME_REGEX
  • HEADER_REGEX
  • TRAILER_REGEX
  • MANIFEST_REGEX
  • HEADER_ENTIRELY
  • TRAILER_ENTIRELY
  • MANIFEST_ENTIRELY
  • HEADER_DELIMITED_COLUMN_INDEX
  • TRAILER_DELIMITED_COLUMN_INDEX
  • MANIFEST_DELIMITED_COLUMN_INDEX
  • COBOL_HEADER_FIELD
  • COBOL_TRAILER_FIELD
  • XML_ELEMENT

Definitions:

  • PATH_NAME: source file path
  • FILE_NAME: source file name
  • REGEX: sequence of characters
  • defining a search pattern
  • MANIFEST: file accompanying data holding metadata
  • ENTIRELY: the trailer or header holds the record count (and nothing else)
  • FIELD: a field name containing the record count (in the trailer or header) is provided as date extraction argument value
  • COLUMN INDEX: column order number
  • XML_ELEMENT: a mark up entry within an XML file that can include text. example: <price>29.99</price> (attribute values also supported)

Record Count Extraction Argument

dataset.expected.record.count.extraction.argument

Value for this property will be either the 'Field Name' (when extraction method value is

COBOL_HEADER_FIELD

COBOL_TRAILER_FIELD

or standard regular expression arguments (when extraction method is regex).

Regular expressions require a pattern which matches the entire string and which specify exactly one capturing group.

When using XML_ELEMENT method to extract record count, this argument is the XPATH expression to navigate to the element (or attribute) locationcontaining the record count value. Example value: NotificationFileTrailer/RecordCount

DELIMITED_COLUMN_INDEX: Users can enter any delimiter and index number for extraction in either the Header, Trailer, or Manifest.

For example, if the desired argument is the second index (starting from 0) in the following:" A|;B|;C|;D|;" one would specify "2 |;". Note the space between 2 and the delimiter (in this case a Pipe and semi-colon).

Record Count Adjustment Number

dataset.expected.record.count.adjustment

 

Record offset number adjusts record count

0 or no entry = record count in the dataset exactly specifies the number of data records.

1 = record count in the dataset includes 1 for a Header (or Trailer) record. 

2 = record count in the dataset includes 1 for a Header record plus 1 for a Trailer record.

Other = FDL or XML may require an adjustment between cat and expected record count

Manifest File Location

dataset.manifest.file.glob

Location of Manifest File Glob when Record Count Extraction Method is MANIFEST_REGEX or MANIFEST_ENTIRELY

Did this page help you?

If you find any issues with this page or its content – a typo, a missing step, or a technical error – let us know how we can improve!