Troubleshooting data tasks

This section describes problems that can occur when working with data tasks and how to troubleshoot them.

Troubleshooting environmental errors

When a data task encounters an environmental error, such as a timeout, network error, or connection error, it retries the operation automatically. If the error is not resolved after retrying, the data task stops running and shows the status Error with an error message.

Landing tasks with data sources that are only accessible via Data Movement gateway:

- The operation is retried an infinite number of times, with an interval of 5 seconds.
- If the outage is long, the interval is doubled until an interval of 1800 seconds is reached.

Landing tasks with data sources that are accessible without Data Movement gateway, Storage tasks, Transform tasks and Data mart tasks:

- The operation is retried 3 times, with an interval of 1 second.
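As a rough illustration of the two retry policies above, the following Python sketch computes the wait intervals. It is not product code: the function names are hypothetical, and it assumes that the interval doubles on every retry, whereas the description only states that doubling happens when the outage is long.

```python
from itertools import islice
from typing import Iterator, List


def gateway_retry_intervals(start: int = 5, cap: int = 1800) -> Iterator[int]:
    """Yield wait times in seconds for tasks that use Data Movement gateway.

    Assumption: the interval doubles on every retry until it reaches `cap`.
    The product may keep the 5-second interval for a while before doubling.
    """
    interval = start
    while True:  # retried an infinite number of times
        yield interval
        interval = min(interval * 2, cap)


def direct_retry_intervals(retries: int = 3, interval: int = 1) -> List[int]:
    """Wait times in seconds for tasks that do not use Data Movement gateway."""
    return [interval] * retries


# Take the first ten intervals of the infinite gateway schedule.
first_ten = list(islice(gateway_retry_intervals(), 10))
print(first_ten)
print(direct_retry_intervals())
```

Under this assumption the gateway schedule reaches the 1800-second ceiling on the tenth retry and stays there.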

Do the following:

1. Resolve the error using the error message.
2. Reload the data task or resume its operation.

Troubleshooting issues with a specific table

When a data task encounters an error while writing to a specific table, the data task will continue running. The table in error will show the status Error with an error message.

1. Resolve the error using the error message.
2. Reload the table that was in error.

Troubleshooting CDC issues

Landing data tasks with Full load & CDC update mode can encounter CDC-related issues that affect the entire task and that cannot be resolved by reloading specific tables. Examples include missing events, issues caused by source database reorganization, and failures when reading source database events.

You can reload all tables to the target to resolve such issues.

1. Stop the data task and all tasks that consume it.
2. Open the data task and select the Monitor tab.
3. Click ..., and then Reload target.

This will reload all tables to the target using Drop-Create, and will restart change data capture from the current time.

Storage tasks that consume the landing data task will be reloaded via compare and apply at their next run to get in sync. Existing history will be kept. Type 2 history will be updated to reflect changes after the reload and compare process is executed.

The timestamp for the from date in the type 2 history will reflect the reload date, and not necessarily the date the change occurred in the source.

Storage live views will not be reliable during the reload target operation and until the storage is in sync. Storage will be fully synced when:
- All tables are reloaded using compare and apply.
- One cycle of changes is performed for each table.

For more information, see Reloading all tables to the target.

NULL values in primary key columns

You may receive an error message when executing a data task: Unknown execution error - NULL result in a non-nullable column.

Possible cause

A column used as a primary key contains NULL values. Columns used as a primary key must not contain NULL values, and should be non-nullable.

Proposed action

In the source data task, add an expression that converts all NULL values to a value, for example, 0.

You can also select another column to use as primary key.
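The effect of such an expression can be illustrated with a small Python sketch that runs SQLite's COALESCE function against an in-memory table. The table and column names are hypothetical, and the expression syntax available in your data task may differ.

```python
import sqlite3

# Hypothetical source table in which the intended key column holds a NULL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (region_id INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, 10.0), (None, 5.0)],
)

# COALESCE returns its first non-NULL argument, so NULL keys become 0.
rows = conn.execute(
    "SELECT COALESCE(region_id, 0) AS region_id, amount "
    "FROM orders ORDER BY amount"
).fetchall()
conn.close()

print(rows)
```

Check that the replacement value does not collide with an existing key value, because primary key values must remain unique.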

Casting error when using Redshift as data platform

You may get the following error or similar when using Redshift as data platform: Failed to find conversion function from "unknown" to character varying

Possible cause

A constant expression is not cast to a type. This may happen more frequently in data marts due to the higher complexity of the final query.

Proposed action

Cast the constant expression as text.

Example:

cast ('my constant string' as Text)

Ambiguous column names

When you register data based on a view created in a Qlik Talend Data Integration pipeline, the view may contain columns that were generated by Qlik Talend Data Integration. The names of these columns, starting with hdr__, are reserved. When a column with a reserved name is consumed in a storage task, the storage task will create columns with the same reserved name, leading to a naming conflict. For example, you can have two columns named hdr__key_hash.

For more information about reserved column names in views, see Views.

Proposed action

Rename the column that comes from the registered data task in the storage data task. For example, rename hdr__key_hash to my__key_hash.
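To find the columns that need renaming before they reach the storage task, a check along these lines could be used. This is a minimal Python sketch: the function name and the my__ replacement prefix are illustrative choices, and only the hdr__ prefix comes from the description above.

```python
from typing import Dict, Iterable

RESERVED_PREFIX = "hdr__"  # prefix of the reserved generated column names


def propose_renames(columns: Iterable[str], new_prefix: str = "my__") -> Dict[str, str]:
    """Map each column that uses the reserved prefix to a non-reserved name."""
    return {
        name: new_prefix + name[len(RESERVED_PREFIX):]
        for name in columns
        if name.startswith(RESERVED_PREFIX)
    }


# Hypothetical column list of a view registered as a data source.
renames = propose_renames(["customer_id", "hdr__key_hash", "amount"])
print(renames)
```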

Troubleshooting a data task based on Data Movement gateway

You can get information about landing operations for data tasks based on Data Movement gateway by inspecting log files. You can also set the level of logging. Logs are available when the data task has completed its first run.

To view log files, you need one of the following permissions in the space where the data task resides:

- Owner
- Can operate

You also need one of the following permissions in the space where the data gateway resides:

- Can consume
- Can edit
- Can manage

Viewing log files

Open the log viewer by clicking View logs in a landing data task based on Data Movement gateway. You can select which log file to view under Replication engine logs. You can scroll to the top and the bottom of the file with the Arrow to scroll to top and Arrow to scroll to bottom buttons.

The view of the log file is not updated automatically with the latest messages. To update, click Arrow to scroll to bottom to scroll to the end of the log file, refreshed with the latest messages.

Setting logging options

You can set the level of logging for different operations of the replication under Logging options.

Storing trace and verbose logging in memory

When the logging level is set to "Trace" or "Verbose", you can instruct Qlik Talend Data Integration to store the logging information in memory until an error occurs. On detecting an error, Qlik Talend Data Integration will begin writing to the physical logs and continue to do so for a few minutes after the initial occurrence of the error.

If no error occurs before the allocated memory is used up, Qlik Talend Data Integration will empty the memory buffer and start afresh.
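The buffering behavior described above can be sketched in Python as follows. This is a simplified model, not the product's implementation: the class and attribute names are hypothetical, it assumes that the buffered messages are written out together with the error, and it keeps writing indefinitely after the first error instead of only for a few minutes.

```python
from typing import List


class BufferedTraceLog:
    """Hold trace messages in memory until an error occurs (simplified model)."""

    def __init__(self, max_bytes: int) -> None:
        self.max_bytes = max_bytes      # the "Allocate memory up to" limit
        self.buffer: List[str] = []     # messages held in memory
        self.used = 0                   # bytes currently buffered
        self.written: List[str] = []    # stands in for the physical log file
        self.error_seen = False

    def log(self, message: str, is_error: bool = False) -> None:
        if self.error_seen:
            # After an error, write straight to the log.
            self.written.append(message)
            return
        if is_error:
            # Assumption: the buffered context is flushed along with the error.
            self.written.extend(self.buffer)
            self.written.append(message)
            self.buffer.clear()
            self.used = 0
            self.error_seen = True
            return
        size = len(message.encode("utf-8"))
        if self.used + size > self.max_bytes:
            # Memory used up without an error: empty the buffer and start afresh.
            self.buffer.clear()
            self.used = 0
        self.buffer.append(message)
        self.used += size


trace = BufferedTraceLog(max_bytes=10)
for text in ("aaaaa", "bbbbb", "cc"):
    trace.log(text)                     # "cc" overflows and resets the buffer
trace.log("boom", is_error=True)
trace.log("after")
print(trace.written)
```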

This option is useful for tasks that fail unpredictably and for no obvious reason. The problem with continually writing large amounts of information to the logs is twofold:

- Running in "Trace" or "Verbose" logging mode will quickly use up available disk space (unless the logging settings have been configured to prevent this).
- Continually writing large amounts of data to the logs will affect performance.

To use this option

1. Select the "Store trace/verbose logging in memory, but if an error occurs, write to the logs" check box at the top of the tab.
2. In the "Allocate memory up to (MB)" field, specify the amount of memory you want to allocate for storing logging information.

Setting logging levels

You can set the following levels:

1. Error: Show error messages.
2. Warning: Show warnings.
3. Info: Show informational messages.
4. Debug: Show additional information for troubleshooting purposes.
5. Detailed debug: Show detailed information for troubleshooting purposes.

The higher levels always include the messages from the lower levels. Therefore, if you select Error, only error messages are written to the log. However, if you select Info, informational messages, warnings, and error messages are included. In general, using the levels Debug and Detailed debug may generate large amounts of log data.
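The inclusion rule can be expressed as a short Python sketch. The level names and their order come from the list above; the function name is hypothetical.

```python
from typing import List

# Levels in ascending order, as listed above.
LEVELS = ["Error", "Warning", "Info", "Debug", "Detailed debug"]


def included_levels(selected: str) -> List[str]:
    """Return the message levels written to the log for the selected level."""
    return LEVELS[: LEVELS.index(selected) + 1]


print(included_levels("Error"))
print(included_levels("Info"))
```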

You can use Global to set the same level for all operations, or set the level individually for each operation.

- Source - full load: Logs activity related to full load operations in the data source. This includes the SELECT statements executed against the source tables prior to full load.
- Source - CDC: Logs activity related to CDC operations in the data source.
  Warning note: Setting this to Detailed debug level will write very large amounts of data to the log.
- Source – data: Detailed logging of data source activity related to full load and CDC operations.
- Target - full load: Logs activity related to full load operations on the target.
- Target - CDC: Logs activity related to CDC operations on the target.
- Target – upload: Logs activity when files are transferred to the target.
- Extended CDC: Detailed logging of CDC activity, such as synchronization and storage of transactions.
- Performance: Logs latency values for source and target.
- Metadata: Logs activity related to reading metadata, as well as metadata changes. Status of the replication task is also logged.
- Infrastructure: Logs infrastructure information, file system operations, and task management.
- Transformation: Logs information related to transformations that are performed.

Downloading diagnostic files

You can download a diagnostic package, task logs, and a memory report to assist you when troubleshooting the replication task associated with the landing task. You can only download one item at a time.

1. In the log viewer, click Download to expand.
2. Select the item to download.
3. Click Download.

The file will either be downloaded to your computer or you will be prompted to save it, depending on your browser settings.

Troubleshooting a data task not using Data Movement gateway

You can get information about landing operations for data tasks that do not use Data Movement gateway by inspecting log files. You can also set the level of logging. Logs are available when the data task has completed its first run. You can view task logs and platform logs by clicking View task logs in a landing or replication task.

You can set the timespan to display with Task completion time.

You can view the following logs:

Task logs

Information note: You need Can operate permission in the space where the data task resides to view task logs.
- Task logs
- Source connection logs

Platform logs

Information note: You need a Tenant Admin role to view platform logs.
- Data Movement gateway logs
- Replication engine logs
- Source connection logs
- SaaS application logs

You can scroll to the top and the bottom of the file with the Arrow to scroll to top and Arrow to scroll to bottom buttons.

The view of the log file is not updated automatically with the latest messages. To update, click Arrow to scroll to bottom to scroll to the end of the log file, refreshed with the latest messages.
