Load Logs: Data Load Information
Access Load Logs for sources and entities by selecting Load Logs from the More dropdown on the object row.
To view Load Logs options, select (actions) and choose from View Details, View File System Logs, View Bad Records, View Ugly Records, Terminate (enabled only while the job status is still RUNNING), Delete, and viewing the object in Discover. If View Details is selected, the load modal appears with additional tabs: General Information about the object, Load Log, Load Properties, Source Properties, Entity Properties, File System Logs, Bad Records, and Ugly Records.
Correlating Bad and Ugly logs to Bad and Ugly Records
When data records fail quality tests, they are partitioned and flagged as Bad or Ugly records. Error logs are tied to the Bad and Ugly records; the messages are intended to help customers identify, understand, and correct data quality issues in ingested data.
Bad and Ugly records are identified by an internal record id (partition_id.internal_location_id). The partition_id is either configured by users upon ingest or (by default) the delivery date time stamp of the ingest (YearMonthDayHourMinuteSecond). The internal_location_id is the character offset marker of the record or field. For example, a character offset of "490" refers to the 490th character of the file that was loaded. The File System Logs tab provides the entire internal record id and the reason the record was flagged as Bad or Ugly.
Records flagged as Bad or Ugly can be viewed in the Bad Records and Ugly Records tabs of the Data Load Information modal, which is accessed by selecting Load Logs from the More dropdown in the object row of source grids. View File System Logs and the Bad and Ugly tabs can also be accessed directly from the loading screen that appears while an entity is loading.
The File System Logs tab displays Location (internal_location_id), Message, and Record content. Note that the Record column displays the first field of the record when the record is Bad (because Bad records reflect a record-wide issue) and the specific offending field when the record is Ugly (because Ugly records reflect field-specific issues).
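The record id format described above can be split into its two parts with a few lines of Python. This is a minimal sketch, not part of the product: `RecordId` and `parse_record_id` are hypothetical names, and the code assumes that any partition_id that does not parse as a 14-digit YearMonthDayHourMinuteSecond time stamp was configured by a user.

```python
from datetime import datetime
from typing import NamedTuple, Optional


class RecordId(NamedTuple):
    partition_id: str                 # user-configured value or delivery time stamp
    internal_location_id: int         # character offset within the loaded file
    delivered_at: Optional[datetime]  # set only when partition_id is a time stamp


def parse_record_id(record_id: str) -> RecordId:
    """Split 'partition_id.internal_location_id' at the last dot."""
    partition_id, _, offset_text = record_id.strip().rpartition(".")
    if not partition_id or not offset_text.isdecimal():
        raise ValueError(f"not a partition_id.internal_location_id value: {record_id!r}")
    try:
        # Default partition_id: YearMonthDayHourMinuteSecond of the ingest.
        delivered_at = datetime.strptime(partition_id, "%Y%m%d%H%M%S")
    except ValueError:
        # Assumption: anything else is a user-configured partition_id.
        delivered_at = None
    return RecordId(partition_id, int(offset_text), delivered_at)
```

For the id `20200805002509.490`, this yields the partition_id `20200805002509`, the character offset 490, and a delivery time of 2020-08-05 00:25:09.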
Location | Message | Record |
---|---|---|
20200805002509.1744 | expected 5 fields but found 4 ... field enclosures not specified ... perhaps your data has quoted fields? | CUSR0000SA0 |
20200805002509.1847 | expected 5 fields but found 4 ... field enclosures not specified ... perhaps your data has quoted fields? | CUSR0000SA0 |
20200805002509.490 | contains unexpected control character | 22.\n10 |
The Bad Records tab displays the Record (fields) and Location (internal_location_id). Because Bad records indicate a record-wide quality issue, the whole record is displayed.
Record | Location |
---|---|
CUSR0000SA0 \t1949\tM10\t 23.67 | 20200805002509.1744 |
CUSR0000SA0 \t1949\tM12\t 23.61 | 20200805002509.1847 |
CUSR0000SA0 \t1950\tM02\t 23.61 | 20200805002509.1898 |
The Ugly Records tab displays the fields of each record individually; Location (internal_location_id) identifies the field that has been flagged. Ugly records have field-related issues (such as a control character found in a field).
series_id | year | period | value | footnote_codes | Location |
---|---|---|---|---|---|
CUSR0000SA0 | 1947 | M10 | 23.\n10 | - | 20200805002509.490 |
CUSR0000SA0 | 1947 | M11 | 23.\n10 | - | 20200805002509.543 |
CUSR0000SA0 | 1947 | M12 | 23.\n10 | - | 20200805002509.596 |
CUSR0000SA0 | 1948 | M01 | 22.\n10 | - | 20200805002509.649 |
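Because all three tabs share the Location value, log messages can be matched to the records they describe by that key. The sketch below assumes the rows of each tab have been copied by hand into Python dictionaries keyed by the column headings; it does not rely on any export feature of the product, and `classify_log_entries` is a hypothetical helper name. The sample rows are taken from the tables above.

```python
from typing import Dict, Iterable, List, Optional, Tuple


def classify_log_entries(
    log_rows: Iterable[Dict[str, str]],
    bad_rows: Iterable[Dict[str, str]],
    ugly_rows: Iterable[Dict[str, str]],
) -> List[Tuple[str, Optional[str], str]]:
    """Label each File System Logs row as Bad, Ugly, or unmatched (None)."""
    bad_locations = {row["Location"] for row in bad_rows}
    ugly_locations = {row["Location"] for row in ugly_rows}
    results = []
    for row in log_rows:
        location = row["Location"]
        if location in bad_locations:
            kind: Optional[str] = "Bad"
        elif location in ugly_locations:
            kind = "Ugly"
        else:
            kind = None  # no record with this Location in either tab
        results.append((location, kind, row["Message"]))
    return results


# Sample rows copied from the tables above (messages shortened).
sample_logs = [
    {"Location": "20200805002509.1744", "Message": "expected 5 fields but found 4"},
    {"Location": "20200805002509.490", "Message": "contains unexpected control character"},
]
sample_bad = [{"Location": "20200805002509.1744"}, {"Location": "20200805002509.1847"}]
sample_ugly = [{"Location": "20200805002509.490"}, {"Location": "20200805002509.543"}]
```

Calling `classify_log_entries(sample_logs, sample_bad, sample_ugly)` labels the first message as belonging to a Bad record and the second to an Ugly record.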
Deleting Load Logs
When a user selects Delete from the action dropdown, they are given three options:
- Delete Load Logs: Deletes the log (it is removed from the UI)
- Drop Table Partitions for Load: Deletes this load from the Distribution table for the Entity
- Delete data in File System (data, logs, profile data, sample data): Deletes everything associated with that data load from HDFS or File System in place
Note that the options chosen on this screen become the (cached) default log deletion configuration for that log. For example, if Drop Table Partitions for Load is the only log deletion option selected, that setting will reappear for that log, unless Delete Load Logs was also selected, in which case the log itself has been deleted. Note also that if Delete data in File System (data, logs, profile data, sample data) is selected, distribution tables are deleted as well, and Drop Table Partitions for Load becomes selected by default.
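The interaction between the three options can be modeled as a small rule set. This is an illustrative sketch of the behavior described above, not the product's implementation; `DeleteOptions`, `normalize`, and `cached_default` are hypothetical names.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class DeleteOptions:
    delete_load_logs: bool = False
    drop_table_partitions: bool = False
    delete_file_system_data: bool = False


def normalize(options: DeleteOptions) -> DeleteOptions:
    """Deleting file system data also selects Drop Table Partitions for Load."""
    if options.delete_file_system_data and not options.drop_table_partitions:
        return replace(options, drop_table_partitions=True)
    return options


def cached_default(options: DeleteOptions) -> Optional[DeleteOptions]:
    """Return the selection that reappears for the log, or None if the log is gone."""
    if options.delete_load_logs:
        return None  # the log itself was deleted, so nothing is left to reopen
    return normalize(options)
```

Under this model, selecting only Delete data in File System yields a cached default with Drop Table Partitions for Load also set, while any selection that includes Delete Load Logs leaves no cached default.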