Ingesting data from an external source, regardless of where it originates (RDBMS, HDFS or a cloud storage service such as Amazon S3, Azure ADLS and Azure WASB, Hive, local server, etc.), involves the two key steps required to onboard data into Qlik Data Catalyst:
- Defining the Source/Entity (metadata)
- Ingest (data)
Once sources have been defined and metadata is in place, data can be loaded from the source. To load data into an entity, navigate to the entity, highlight the row and select
A popup appears in which the user can assign an editable date and timestamp to the data load. Select OK.
This begins the data load. Depending on the volume of data being ingested and the speed of the connection to the source system, this may take several minutes. If the data is several gigabytes or larger, the load may take significantly longer. The status of the load is shown under Job Status; a RUNNING status appears until the load has FINISHED. To refresh the logs and monitor the load status, select the load row and select Reload Logs from the Bulk Action dropdown. When users first arrive at the load screen and data is loading for the first time, they should click Refresh to initiate the load.
INITIALIZED loads: When jobs are queued but have yet to start running they are in an INITIALIZED state. In the context of
Select Refresh Load Logs to refresh the load status, or set an Auto Refresh interval from the dropdown options:
- No Auto Refresh [default]
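The status lifecycle described above (INITIALIZED while queued, RUNNING while loading, then FINISHED or FAILED) and the idea of refreshing on an interval can be illustrated with a minimal polling sketch. This is not a Qlik Data Catalyst API: `JobStatus`, `wait_for_load`, and the `fetch_status` callable are hypothetical names introduced only for illustration, and `fetch_status` stands in for whatever mechanism actually reports the load status.

```python
import time
from enum import Enum
from typing import Callable, Optional


class JobStatus(Enum):
    """Load states named in this section (illustrative model only)."""
    INITIALIZED = "INITIALIZED"  # queued, not yet running
    RUNNING = "RUNNING"
    FINISHED = "FINISHED"
    FAILED = "FAILED"


# A load is complete once it reaches one of these states.
TERMINAL = {JobStatus.FINISHED, JobStatus.FAILED}


def wait_for_load(
    fetch_status: Callable[[], JobStatus],  # hypothetical status source
    interval_seconds: float = 5.0,          # analogous to an auto-refresh interval
    max_polls: Optional[int] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> JobStatus:
    """Poll fetch_status until the load is FINISHED or FAILED."""
    polls = 0
    while True:
        status = fetch_status()
        polls += 1
        if status in TERMINAL:
            return status
        if max_polls is not None and polls >= max_polls:
            raise TimeoutError(f"load still {status.value} after {polls} polls")
        sleep(interval_seconds)
```

Passing `max_polls` bounds the wait, which may be useful for large loads that run much longer than a few minutes.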
Upon completion, the status will show as FINISHED or FAILED, and the log will show the results of the action. When Job Status is FINISHED, the records provide totals of Good Record Count, Bad Record Count, Ugly Record Count, and Filtered Record Count.
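As a rough illustration of how the four record counts reported for a FINISHED load might be summarized, the sketch below models them as a small data structure. The `LoadCounts` class and its field names are hypothetical, and it assumes the four counts are disjoint and together make up the full set of records read, which this section does not confirm.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoadCounts:
    """Record totals shown for a FINISHED load (illustrative model only)."""
    good: int
    bad: int
    ugly: int
    filtered: int

    @property
    def total(self) -> int:
        # Assumption: the four counts are disjoint and cover every record.
        return self.good + self.bad + self.ugly + self.filtered

    @property
    def exception_ratio(self) -> float:
        """Share of records counted as bad or ugly; 0.0 for an empty load."""
        return (self.bad + self.ugly) / self.total if self.total else 0.0


counts = LoadCounts(good=90, bad=5, ugly=5, filtered=0)
print(counts.total, counts.exception_ratio)
```

A ratio like this could serve as a quick signal for when the per-record messages are worth inspecting.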
Expand the round '+' symbol to the right of the record counts to view messages with information about why records may have excepted as
When Job Status is FAILED, select View Properties from the action dropdown to open Data Load Information.
The Load Log contains details about why the load failed.