Troubleshooting data tasks
This section describes problems that can occur when working with data tasks, and how to troubleshoot them.
Troubleshooting environmental errors
When a data task encounters an environmental error, for example, a timeout, network error, or connection error, it retries the operation automatically. If the error is not resolved after retrying, the data task stops running and shows the status Error with an error message.
- Landing tasks with data sources that are only accessible via Data Movement gateway:
  The operation is retried an infinite number of times, starting with an interval of 5 seconds. If the outage is long, the interval is doubled until it reaches 1800 seconds.
- Landing tasks with data sources that are accessible without Data Movement gateway, as well as Storage tasks, Transform tasks, and Data mart tasks:
  The operation is retried 3 times, with an interval of 1 second.
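The two retry schedules above can be sketched as follows. This is a minimal illustration of the documented intervals, not the actual engine code:

```python
def retry_intervals(via_gateway, attempts=10):
    """Yield the wait (in seconds) before each retry attempt."""
    if via_gateway:
        # Gateway-only sources: infinite retries, starting at 5 seconds
        # and doubling during a long outage, capped at 1800 seconds.
        interval = 5
        for _ in range(attempts):
            yield interval
            interval = min(interval * 2, 1800)
    else:
        # All other tasks: 3 retries, 1 second apart.
        for _ in range(3):
            yield 1

print(list(retry_intervals(True, 5)))   # [5, 10, 20, 40, 80]
print(list(retry_intervals(False)))     # [1, 1, 1]
```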
Do the following:
- Resolve the error using the error message.
- Reload or resume operation of the data task.
Troubleshooting issues with a specific table
When a data task encounters an error while writing to a specific table, the data task will continue running. The table in error will show the status Error with an error message.
- Resolve the error using the error message.
- Reload the table that was in error.
Troubleshooting CDC issues
Landing data tasks with Full load & CDC update mode can encounter CDC related issues that affect the entire task, and that cannot be resolved by reloading specific tables. Examples of issues are missing events, issues caused by source database reorganization, or failure when reading source database events.
You can reload all tables to the target to resolve such issues.
- Stop the data task and all tasks that consume it.
- Open the data task and select the Monitor tab.
- Click ..., and then Reload target.
  This reloads all tables to the target using Drop-Create, and restarts all change data capture from now.
- Storage tasks that consume the landing data task will be reloaded via compare and apply at their next run to get in sync. Existing history will be kept. Type 2 history will be updated to reflect changes after the reload and compare process is executed.
  The timestamp for the from date in the type 2 history will reflect the reload date, and not necessarily the date the change occurred in the source.
- Storage live views will not be reliable during the reload target operation, and until the storage is in sync. Storage will be fully synced when:
  - All tables are reloaded using compare and apply.
  - One cycle of changes is performed for each table.

For more information, see Reloading all tables to the target.
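The type 2 timestamp behavior described above can be illustrated with a small sketch. The column names, dates, and compare logic here are hypothetical, not the actual storage implementation; the point is that the new version's from date is the reload date, not the source change date:

```python
from datetime import date

# One open type 2 version exists before the reload.
history = [
    {"id": 1, "name": "old", "from_date": date(2024, 1, 1), "to_date": None},
]

def apply_compare(history, current_row, reload_date):
    """Close the open version and add a new one dated at the reload."""
    open_version = next(v for v in history if v["to_date"] is None)
    if open_version["name"] != current_row["name"]:
        open_version["to_date"] = reload_date
        history.append({**current_row, "from_date": reload_date, "to_date": None})
    return history

# The source row changed at some earlier, unknown time; the compare runs now.
apply_compare(history, {"id": 1, "name": "new"}, date(2024, 6, 1))
print(history[-1]["from_date"])  # 2024-06-01: the reload date
```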
NULL values in primary key columns
You may receive an error message when executing a data task: Unknown execution error - NULL result in a non-nullable column.
Possible cause
Columns used as a primary key must not contain NULL values, and should be non-nullable.
Proposed action
In the source data task, add an expression that converts all NULL values to a value, for example, 0.
You can also select another column to use as primary key.
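The exact expression syntax depends on the data task, but the idea is a COALESCE-style conversion of NULL key values to a default. A minimal sketch in Python, with a hypothetical column name and default value:

```python
def coalesce_key(rows, key, default=0):
    """Replace NULL (None) values in a key column with a default,
    mirroring an expression like COALESCE(key, 0) in the data task."""
    return [{**row, key: default if row[key] is None else row[key]}
            for row in rows]

rows = [{"id": 1, "name": "a"}, {"id": None, "name": "b"}]
fixed = coalesce_key(rows, "id")
print(fixed[1]["id"])  # 0
```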
Casting error when using Redshift as data platform
You may get the following error or similar when using Redshift as data platform: Failed to find conversion function from “unknown” to character varying
Possible cause
Missing casting of a constant expression. This may happen more frequently in data marts due to the higher complexity of the final query.
Proposed action
Cast the constant expression as text.
Example: if a constant such as 'N/A' causes the error, cast it explicitly, for example 'N/A'::text or CAST('N/A' AS TEXT).
Ambiguous column names
When you register data based on a view created in a Qlik Talend Data Integration pipeline, the view may contain columns that were generated by Qlik Talend Data Integration. The names of these columns, starting with hdr__, are reserved. When a column with a reserved name is consumed in a storage task, the storage task will create columns with the same reserved name, leading to a naming conflict. For example, you can have two columns named hdr__key_hash.
For more information about reserved column names in views, see Views.
Proposed action
Rename the column that comes from the registered data task in the storage data task. For example, rename hdr__key_hash to my__key_hash.
Troubleshooting a data task based on Data Movement gateway
You can get information about landing operations for data tasks based on Data Movement gateway by inspecting log files. You can also set the level of logging. Logs are available when the data task has completed its first run.
To view log files, you need one of the following permissions in the space where the data task resides:
- Owner
- Can operate
You also need one of the following permissions in the space where the data gateway resides:
- Can consume
- Can edit
- Can manage
Viewing log files
Open the log viewer by clicking View logs in a landing data task based on Data Movement gateway. You can select which log file to view under Replication engine logs. You can scroll to the top and the bottom of the file.
Setting logging options
You can set the level of logging for different operations of the replication under Logging options.
Storing trace and verbose logging in memory
When the logging level is set to "Trace" or "Verbose", you can instruct Qlik Talend Data Integration to store the logging information in memory until an error occurs. On detecting an error, Qlik Talend Data Integration will begin writing to the physical logs and continue to do so for a few minutes after the initial occurrence of the error.
If no error occurs before the allocated memory is used up, Qlik Talend Data Integration will empty the memory buffer and start afresh.
This option is useful for tasks that fail unpredictably and for no obvious reason. The problem with continually writing large amounts of information to the logs is twofold:
- Running in "Trace" or "Verbose" logging mode will quickly use up available disk space (unless the logging settings have been configured to prevent this).
- Continually writing large amounts of data to the logs will affect performance.
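The buffering behavior described above can be sketched as follows. The buffer size and flush policy here are assumptions for illustration, not the product's implementation:

```python
from collections import deque

class BufferedTraceLog:
    """Hold trace messages in memory; write them out only on error."""

    def __init__(self, max_messages=1000):
        # When the buffer fills up without an error, the oldest messages
        # are discarded, mirroring "empty the buffer and start afresh".
        self.buffer = deque(maxlen=max_messages)
        self.flushed = []  # stands in for the physical log files

    def log(self, level, message):
        self.buffer.append((level, message))
        if level == "error":
            # On error, write the buffered context to the physical log.
            self.flushed.extend(self.buffer)
            self.buffer.clear()

log = BufferedTraceLog()
log.log("trace", "step 1")
log.log("trace", "step 2")
print(len(log.flushed))  # 0: nothing written yet
log.log("error", "something failed")
print(len(log.flushed))  # 3: buffered trace plus the error
```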
To use this option:
- Select the Store trace/verbose logging in memory, but if an error occurs, write to the logs check box at the top of the tab.
- In the Allocate memory up to (MB) field, specify the amount of memory you want to allocate for storing logging information.
Setting logging levels
You can set the following levels:
1. Error
   Show error messages.
2. Warning
   Show warnings.
3. Info
   Show informational messages.
4. Debug
   Show additional information for troubleshooting purposes.
5. Detailed debug
   Show detailed information for troubleshooting purposes.
The higher levels always include the messages from the lower levels. Therefore, if you select Error, only error messages are written to the log. However, if you select Info, informational messages, warnings, and error messages are included. In general, using the levels Debug and Detailed debug may generate large amounts of log data.
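The inclusion rule can be sketched as a simple comparison of level ranks. This is an illustration of the behavior described above, not the product's implementation:

```python
# Levels in ascending order; a higher level includes all lower ones.
LEVELS = ["error", "warning", "info", "debug", "detailed debug"]

def included(selected, message_level):
    """A message is written when its level is at or below the selected level."""
    return LEVELS.index(message_level) <= LEVELS.index(selected)

print(included("info", "warning"))   # True: Info includes warnings
print(included("error", "info"))     # False: Error writes only errors
```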
You can use Global to set the same level for all operations, or set the level individually for each operation.
- Source - full load
  Logs activity related to full load operations in the data source. This includes the SELECT statements executed against the source tables prior to full load.
- Source - CDC
  Logs activity related to CDC operations in the data source.
  Warning note: Setting this to Detailed debug level will generate very large amounts of data to the log.
- Source - data
  Detailed logging of data source activity related to full load and CDC operations.
- Target - full load
  Logs activity related to full load operations on the target.
- Target - CDC
  Logs activity related to CDC operations on the target.
- Target - upload
  Logs activity when files are transferred to the target.
- Extended CDC
  Detailed logging of CDC activity, such as synchronization and storage of transactions.
- Performance
  Logs latency values for source and target.
- Metadata
  Logs activity related to reading metadata, as well as metadata changes. The status of the replication task is also logged.
- Infrastructure
  Logs infrastructure information, file system operations, and task management.
- Transformation
  Logs information related to transformations that are performed.
Downloading diagnostic files
You can download a diagnostic package, task logs, and a memory report to assist you when troubleshooting the replication task associated with the landing task. You can only download one item at a time.
- In the log viewer, click Download to expand.
- Select the item to download.
- Click Download.
The file will either be downloaded to your computer or you will be prompted to save it, depending on your browser settings.
Troubleshooting a data task not using Data Movement gateway
You can get information about landing operations for data tasks that do not use Data Movement gateway by inspecting log files. You can also set the level of logging. Logs are available when the data task has completed its first run. You can view task logs and server logs.
Viewing task logs
To view task log files, you need one of the following permissions in the space where the data task resides:
- Owner
- Can operate
You also need one of the following permissions in the space where the data gateway resides:
- Can consume
- Can edit
- Can manage
Open the log viewer by clicking View task logs in a landing data task not using Data Movement gateway. You can select which log file to view under Replication engine logs. You can scroll to the top and the bottom of the file.
You can set the level of logging for different operations of the replication under Logging options. For more information, see Setting logging options.
Viewing data movement logs
To view data movement logs, you need a Data Admin or Tenant Admin role.
Open the log viewer by clicking View data movement logs in a landing data task not using Data Movement gateway. For more information about the logs, see Viewing and downloading log files.