Business date extraction method dropdown
The business date can be extracted from a source header, trailer, manifest file, or file name, and it overrides the date generated at runtime. The extracted load date is applied to the directory structure in the file system and is used to create partitions in HCatalog.
The following methods are supported. Users select the appropriate extraction method in the dropdowns of the property panel.
COBOL | PATH NAME REGEX |
FDL | PATH NAME REGEX |
JDBC | MANIFEST REGEX |
JSON | FILE NAME REGEX |
XML | FILE NAME REGEX |
Key | Value | Meaning |
---|---|---|
Extraction Method | TRAILER.REGEX | The date is extracted from the trailer; the regex matches and extracts the pattern. |
Extraction Argument | `.*(\d{4}\.\d{2}\.\d{2}\.\d{2}\.\d{2}\.\d{2}).*` | Searches for regex pattern: |
Date Pattern | yyyy.MM.dd.HH.mm.ss | Provides the date format to the application to enable parsing. This format instructs the application to parse the date as year, month, day, hour, minutes, and seconds. |
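The interaction between the extraction argument and the date pattern in the example above can be sketched in plain Java. This is an illustration only, not the application's implementation: the class name, method names, and the sample trailer line are hypothetical, and only the regex and the date pattern come from the table.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: mimics a TRAILER.REGEX extraction with standard JDK classes.
class TrailerDateSketch {
    // Extraction argument and date pattern copied from the example table.
    static final String EXTRACTION_ARGUMENT =
            ".*(\\d{4}\\.\\d{2}\\.\\d{2}\\.\\d{2}\\.\\d{2}\\.\\d{2}).*";
    static final String DATE_PATTERN = "yyyy.MM.dd.HH.mm.ss";

    // Returns the text captured by group 1, or null when the trailer does not match.
    static String extractDateText(String trailer) {
        Matcher matcher = Pattern.compile(EXTRACTION_ARGUMENT).matcher(trailer);
        return matcher.matches() ? matcher.group(1) : null;
    }

    // Parses the captured text strictly with the configured date pattern.
    static Date parseBusinessDate(String dateText) {
        SimpleDateFormat format = new SimpleDateFormat(DATE_PATTERN);
        format.setLenient(false); // reject values such as month 13
        try {
            return format.parse(dateText);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Date does not match " + DATE_PATTERN, e);
        }
    }

    public static void main(String[] args) {
        // Assumed trailer layout, invented for this example.
        String trailer = "TRAILER|rows=1200|2016.03.15.14.30.05|EOF";
        String dateText = extractDateText(trailer);
        System.out.println("Captured text: " + dateText);
        System.out.println("Parsed date:   " + parseBusinessDate(dateText));
    }
}
```

The capture group (the parentheses in the regex) selects the text handed to the date pattern, so the two settings have to describe the same layout: six dot-separated numeric fields in this case.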
Note: If dataset.date.time.pattern does not specify a time, 00:00:00 is used. The date pattern must be written with Java SimpleDateFormat pattern characters, which are case-sensitive. Months are specified with uppercase ('MM') and minutes with lowercase ('mm'). Case also matters for the hours component of the time: uppercase 'HH' designates a 24-hour clock with the range [00-23]. If the source data uses a 12-hour clock instead, use lowercase 'hh' together with 'aa' (the AM/PM marker) in dataset.date.time.pattern so that the pattern matches the date-time value provided.
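The case rules above are easy to get wrong, so here is a minimal Java sketch that shows their effect with SimpleDateFormat directly. The class and method names are hypothetical, and the sample values are invented; UTC and an English locale are set only to keep the output deterministic.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical sketch: shows how pattern-letter case changes the parsed result.
class DatePatternCaseSketch {
    // Parses text with the given pattern and prints it back in one fixed layout.
    static String reformat(String text, String inputPattern) {
        SimpleDateFormat in = new SimpleDateFormat(inputPattern, Locale.ENGLISH);
        SimpleDateFormat out = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss", Locale.ENGLISH);
        in.setTimeZone(TimeZone.getTimeZone("UTC"));
        out.setTimeZone(TimeZone.getTimeZone("UTC"));
        try {
            Date parsed = in.parse(text);
            return out.format(parsed);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Cannot parse '" + text + "'", e);
        }
    }

    public static void main(String[] args) {
        // No time in the pattern: the time defaults to 00:00:00.
        System.out.println(reformat("2016.03.15", "yyyy.MM.dd"));
        // 24-hour clock with 'HH'.
        System.out.println(reformat("2016.03.15 14:30:05", "yyyy.MM.dd HH:mm:ss"));
        // 12-hour clock with 'hh' plus the AM/PM marker 'aa'.
        System.out.println(reformat("2016.03.15 02:30:05 PM", "yyyy.MM.dd hh:mm:ss aa"));
        // Mistake: 'mm' (minutes) where 'MM' (month) was intended.
        // "03" is read as minutes and the month falls back to January.
        System.out.println(reformat("2016.03.15", "yyyy.mm.dd"));
    }
}
```

The last call does not fail; it silently produces a date in January with a three-minute offset, which is why the case of the pattern characters deserves a second look before a load is scheduled.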
The following properties can be set to define method, argument, date pattern, and manifest file location.
Key | Value |
---|---|
Date Extraction Method: dataset.date.time.extraction.method | Dropdown Values: NONE Definitions: |
Date Extraction Argument: dataset.date.time.extraction.argument | The value is either the field name (when the extraction method is COBOL_HEADER_FIELD or COBOL_TRAILER_FIELD) or a standard regular expression (when the extraction method is a regex method). DELIMITED_COLUMN_INDEX: enter a delimiter and an index number to extract from the header, trailer, or manifest. For example, to extract the field at index 2 (counting from 0) in `A\|;B\|;C\|;D\|;`, specify `2 \|;`. Note the space between the 2 and the delimiter (in this case a pipe and a semicolon). |
Date Pattern: dataset.date.time.pattern | Describes the format of the date so that it can be interpreted accurately. Patterns are defined using the Java SimpleDateFormat pattern specification. Example: MM/dd/yy |
Manifest File Location: dataset.manifest.file.glob | Location of the manifest file glob when the Date Extraction Method is MANIFEST_REGEX |
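The DELIMITED_COLUMN_INDEX argument described in the table can be read as "index, space, delimiter". The following Java sketch shows one way such an argument could be interpreted. It is an assumption about the behavior, not the application's code: the class and method names are hypothetical, and splitting the argument at the first space is inferred from the example in the table.

```java
import java.util.regex.Pattern;

// Hypothetical sketch: interprets an argument such as "2 |;" as
// "zero-based index 2, delimiter |;" and picks that field from a line.
class DelimitedColumnIndexSketch {
    static String extractField(String line, String argument) {
        int space = argument.indexOf(' ');
        if (space < 0) {
            throw new IllegalArgumentException("Expected '<index> <delimiter>'");
        }
        int index = Integer.parseInt(argument.substring(0, space));
        String delimiter = argument.substring(space + 1);
        // Pattern.quote treats the delimiter literally, so '|' is not a regex alternation.
        String[] fields = line.split(Pattern.quote(delimiter));
        if (index < 0 || index >= fields.length) {
            throw new IllegalArgumentException("No field at index " + index);
        }
        return fields[index];
    }

    public static void main(String[] args) {
        // Sample line and argument from the table.
        System.out.println(extractField("A|;B|;C|;D|;", "2 |;"));
    }
}
```

Quoting the delimiter matters in this sketch because the pipe character is a regex metacharacter; without the quote, the split would happen in unexpected places.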