Data ingest: Loading data

Ingesting data from an external source, whether an RDBMS, HDFS, a cloud storage service (such as Amazon S3, Azure ADLS, or Azure WASB), Hive, or a local server, involves two key steps to onboard data into Qlik Data Catalyst:

  • Defining the Source/Entity (metadata)
  • Ingest (data)

Once sources have been defined and metadata is in place, data can be loaded from the source. To load data into an entity, navigate to the entity, highlight the row and select Load from the dropdown.

Loading data

A popup appears in which the user can assign an editable date and timestamp to the data load. Select OK.

Data load popup

This begins the data load. Depending on the amount of data being ingested and the speed of the connection to the source system, the load may take several minutes. If the data is several gigabytes or larger, the load may take significantly longer. The status of the load is shown under Job Status: a RUNNING status appears until the load has FINISHED. To refresh the logs and monitor the load status, select the load row and select Reload Logs from the Bulk Action dropdown. When users first arrive at the load screen and data is loading for the first time, click Refresh to initiate the load.

INITIALIZED loads: Jobs that are queued but have not yet started running are in an INITIALIZED state. For QVD loads, where QVD entities are initialized but have not yet started loading, users may see this status linger longer than is typical for non-QVD entities. A maximum of five QVD data loads are allowed in the INITIALIZED state at a time.

Note the behavior of loads when Tomcat is restarted: loads that are in the INITIALIZED state when Tomcat is restarted remain INITIALIZED and do not convert to RUNNING after the restart; instead, they FAIL after a mandatory two-hour waiting interval. In contrast, jobs that are in a RUNNING state when Tomcat restarts are killed and FAIL immediately.
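The restart behavior described above can be sketched as a small status model. This is an illustrative sketch only, not Qlik Data Catalyst code: the names `LoadStatus`, `Load`, `status_after_restart`, and `can_queue_qvd_load` are hypothetical, and the sketch assumes that the two-hour interval is measured from the restart and that the load fails once the full interval has elapsed.

```python
from dataclasses import dataclass
from datetime import timedelta
from enum import Enum


class LoadStatus(Enum):
    INITIALIZED = "INITIALIZED"
    RUNNING = "RUNNING"
    FINISHED = "FINISHED"
    FAILED = "FAILED"


# Values taken from the text above; the constant names are hypothetical.
INITIALIZED_WAIT = timedelta(hours=2)
MAX_INITIALIZED_QVD_LOADS = 5


@dataclass
class Load:
    name: str
    status: LoadStatus
    is_qvd: bool = False


def status_after_restart(load: Load, elapsed: timedelta) -> LoadStatus:
    """Status a load would show `elapsed` after a Tomcat restart.

    Assumption: the waiting interval is counted from the restart.
    """
    if load.status is LoadStatus.RUNNING:
        # Running jobs are killed and fail immediately.
        return LoadStatus.FAILED
    if load.status is LoadStatus.INITIALIZED:
        # Queued jobs never move to RUNNING; they fail after the wait.
        if elapsed >= INITIALIZED_WAIT:
            return LoadStatus.FAILED
        return LoadStatus.INITIALIZED
    # FINISHED and FAILED loads are already complete and stay as they are.
    return load.status


def can_queue_qvd_load(loads: list) -> bool:
    """True if fewer than five QVD loads are currently INITIALIZED."""
    waiting = sum(
        1 for l in loads if l.is_qvd and l.status is LoadStatus.INITIALIZED
    )
    return waiting < MAX_INITIALIZED_QVD_LOADS
```

For example, under these assumptions a load that was INITIALIZED at restart still reports INITIALIZED after 30 minutes and FAILED after two hours, while a load that was RUNNING reports FAILED straight away.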

Refresh the load logs to update the load status, or set an Auto Refresh interval from the dropdown options:

  • No Auto Refresh [default]
  • 15s
  • 30s
  • 60s
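Conceptually, Auto Refresh re-checks the load status at a fixed interval until the load reaches a final state. The sketch below illustrates that idea only; it does not call any Qlik Data Catalyst API. The function `wait_for_load` and its `fetch_status` callable are hypothetical stand-ins for whatever returns the current Job Status.

```python
import time
from typing import Callable

# Interval options listed above, in seconds.
AUTO_REFRESH_SECONDS = (15, 30, 60)
# Final states named on this page.
FINAL_STATUSES = {"FINISHED", "FAILED"}


def wait_for_load(
    fetch_status: Callable[[], str],
    interval_s: int,
    max_polls: int = 100,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll a (hypothetical) status source until the load finishes or fails.

    Returns the last status seen, which may be non-final if max_polls
    is reached first.
    """
    if interval_s not in AUTO_REFRESH_SECONDS:
        raise ValueError(f"interval must be one of {AUTO_REFRESH_SECONDS}")
    status = fetch_status()
    polls = 1
    while status not in FINAL_STATUSES and polls < max_polls:
        sleep(interval_s)          # wait one refresh interval
        status = fetch_status()    # refresh the status
        polls += 1
    return status
```

The `sleep` parameter is injectable so that the loop can be exercised without actually waiting, and `max_polls` keeps the loop bounded if a load lingers in the INITIALIZED state.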

Refreshing load logs

Upon completion, status will show as FINISHED or FAILED and show results of the load:

Completed refresh

When Job Status is FINISHED, the results provide totals for Good Record Count, Bad Record Count, Ugly Record Count, and Filtered Record Count.

Record count

Select the round '+' symbol to the right of the record counts to view messages explaining why records were flagged as Bad or Ugly.

Viewing record count messages

When Job Status is FAILED, select View Properties from the action dropdown to open Data Load Information.

The Load Log contains details about why the load failed.

The Load Log
