Mainframe use case: Filtering on multiple record types (sub-01 Level)
The property (cobol.record.filter.string) allows COBOL layouts with multiple record types to be processed and filtered to create an entity with only records of matching record type.
The following example shows how to load a COBOL copybook and data from a Mainframe file filtered on record type so that the entity that is created will consist only of records where the loan type <LOAN-TYPE> is 'AUTO'
The example below shows
- An entity loaded without the record filtering property
- Definition of the cobol.record.filter.string property to extract only records with loan type 'AUTO'
- Sample data for the loaded entity filtered on record type 'AUTO'
Note that this property must be manually added to the properties panel.
Step 1. Connect to the data source and provide the information required to establish the connection. Initiate the Add Data wizard and select To New Source
Step 2a. Name the entity to be created and the filepath/location of sourced data. Upload the copybook for the dataset.
Step 2b. Select the Groups that will require access to the data.
Step 3. Select the entity or entities to be created. Entities can be renamed. To display entity fields expand the caret to the left of the entity name. Click on Next.
Step 4. Select Internal file format Type. Save the Entity.
Note that in single node environments, TEXT_TAB_DELIMITED is the only option for format type.
Once the copybook has been uploaded and metadata definition steps are complete, navigate to the entities screen, locate the entity that was created (in this case 'AUTO-LOAN') and click on Load.
Job status shows that the entity was created and metadata and data loaded but because the file contains multiple record types (LOAN-TYPEs) the entity ('MF_1') loads with some Bad records. In this case, Good Record Count is 3, and Bad Record Count is 7.
Property Creation and Definition:
Navigate to the entity properties tab by selecting Details or View/Edit Properties from the More dropdown (for the example, the entity is 'AUTO-LOAN'). Locate cobol.copybook property and click on the
(view) icon to open the copybook in a larger field window.
In the copybook, locate the statement that describes the variable to be filtered on (in the example below 'LOAN-TYPE PIC.X(4)'). The copybook lists the values that the variable can be filtered on. This example will modify the entity specification to include records of type 'AUTO' and exclude records of type 'HOME' and 'EDUC'.
Copybook example text:
01 AUTO-LOAN. 20 RECORD-TYPE PIC X(4). 88 LOAN-RECORD-FLAG VALUE 'LOAN'. 20 LOAN-TYPE PIC X(4). 88 AUTO-LOAN-FLAG VALUE 'AUTO'. 88 HOME-LOAN-FLAG VALUE 'HOME'. 88 EDUC-LOAN-FLAG VALUE 'EDUC'. 20 LOAN-NUMBER PIC 9(4). 20 CUSTOMER-ID PIC X(4). 20 LOAN-AMOUNT PIC 9(6)V9(2). 20 AUTO-TYPE PIC X(8). 20 AUTO-MODEL-YEAR PIC 9(4). 20 FILLER PIC X(28).
Copybook example screenshot

Locate 'cobol.record.filter.string' property
Enter "include record when <FIELD-NAME> == <'constant-value'>;"
In this example the property value is: 'include record when LOAN-TYPE == 'AUTO';'
The newly defined property specifies that only records with a value of AUTO in the LOAN-TYPE field will be extracted for this entity.
Once entered, Save the definition.
Reload data for the entity 'AUTO-LOAN'. The record filter string has extracted the total Good records that satisfy the AUTO-LOAN record type criteria. Note that the Record Count displays only the number of records adjusted for the records that were filtered out. Therefore, there is a total number of 3 records, and they are all LOAN-TYPE == 'AUTO'.
Navigate to the discover module to view sample data for the entity.'AUTO-LOAN'
RECORD_TYPE | LOAN_TYPE | LOAN_NUMBER | CUSTOMER_ID | LOAN_AMOUNT | AUTO_TYPE | AUTO_MODEL_YEAR |
---|---|---|---|---|---|---|
LOAN | AUTO | 1 | elon | 99999.88 | Tesla | 2015 |
LOAN | AUTO | 6 | jack | 2611 | Chevy | 1957 |
LOAN | AUTO | 7 | mcqn | 2611 | Mustang | 1968 |
Screenshot of sample data for the entity 'AUTO-LOAN'

Processing RDW (Record Descriptor Word) and BDW (Block Descriptor Word) variations
Users may require extended support for processing Mainframe Variable Block files where record layout variations include Record Descriptor Word (a 4 byte field reserved at the beginning of each record describing the record) only or a combination of both Record Descriptor Word (RDW) and Block Descriptor Word (BDW), a 4 byte field at the beginning of each block describing the block. Treatment of these variations requires specification of record.layout property in the entity property panel before data is loaded.
Ingest for Variable Block files requires an additional property that must be manually specified before data is loaded: record.layout
Options for this property value include: MAINFRAME_RDW_BDW and MAINFRAME_RDW_ONLY
Enhanced filtering on multiple record types
Users can filter on record type and define criteria on which to also filter on field value with logical operators and multi-operation constructor functions.
Example: cobol.record.filter.string = include record when MANUFACTURER == 'LYCOMING' and HORSEPOWER >= 300;
The resulting dataset will return records for engines manufactured by LYCOMING with horsepower greater than or equal to 300.
The following operators are supported for filtering on record types:
Supported Query Operator | Means | Criteria/Return |
---|---|---|
> |
greater than |
|
< |
less than |
|
>= |
greater than or equal to |
|
<= |
Less than or equal to |
|
== |
equal to |
|
!= |
not equal to |
|
+ |
addition |
|
- |
subtraction/ negation |
|
* |
multiplication |
|
/ |
division |
|
% |
modulo |
returns the modulo (remainder after division) of two numbers (e.g.,'9%4'will return 1) |
=~ |
regex |
returns values that match specified pattern |
; |
semicolon |
terminates expressions, lets the compiler know that the line as finished |
() |
parentheses |
grouping for operator precedence |
ID |
ID |
field name |
IS NULL IS NOT NULL |
IS NULL |
tests if the value is a database NULL |
IS NUMERIC IS NOT NUMERIC |
a number |
tests if value is a valid NUMERIC value |
true |
Boolean (true) literal |
returns records that satisfy criteria |
false |
Boolean (false) literal |
returns records that do not satisfy criteria |
'string' |
string literal |
returns a constant-value syntactic string (chararray) expression |
Supported Query Operator | Means | Criteria/Return |
---|---|---|
|
||
|| (double pipe) OR |
or |
returned items match either value |
AND && |
and |
returned items must match both values |
! NOT |
not |
returned items must not match the following value |
Note the following specifications around cobol.record.filter.string:
<FIELD-NAME> can be of types:
PIC X
Group item
PIC 9 (unsigned DISPLAY)
PIC 9 COMP (unsigned BINARY)
PIC S9 COMP (signed BINARY)
PIC 9 COMP-3 (unsigned PACKED-DECIMAL)
PIC S9 COMP-3 (signed PACKED-DECIMAL)
<constant-value> can be (single or double) quoted for the following types:
PIC X
Group item
PIC 9 (unsigned DISPLAY)
*Quoted values must contain the same number of characters as the filtered item. Space pad on the right if necessary.
<constant-value> can be a non-negative integer for the following numeric types:
PIC 9 (unsigned DISPLAY)
PIC 9 COMP (unsigned BINARY)
PIC S9 COMP (signed BINARY)
PIC 9 COMP-3 (unsigned PACKED-DECIMAL)
PIC S9 COMP-3 (signed PACKED-DECIMAL)
*Numeric constant values must fit in the space defined by the item.
For character types the <constant-value> is a (single or double) quoted string of the exact same length as <FIELD-NAME>
The length of the string constant must be the same number of characters in the RECORD-TYPE-FIELD
Leading "include record when" and trailing semi-colon ';' are required
<FIELD-NAME> can represent an item in of either character or numeric type
Note that PIC 9 (unsigned DISPLAY) represents a numeric value in decimal format and supports either a quoted or unquoted integer value
Byte order is assumed to be big-endian / IBM mainframe format
For numeric types the <constant-value> is a non-negative integer