Click the Change Processing Tuning sub-tab to fine-tune the Apply Changes settings.
Determine which method will be used to apply changes.
Changes to tables without a Unique Index or Primary Key will always be applied in Transactional apply mode.
Applying cached events in transactional mode to endpoints that do not enforce constraints (such as Vertica and IBM Netezza) may result in duplicate records on the target. This is because such endpoints do not return duplicate errors.
Batch optimized apply: Select this option to commit the changes in batches. In this case, a pre-processing action occurs to group the transactions into batches in the most efficient way. This may affect transactional integrity. Therefore, you must select one of the following options to determine how the system will handle referential integrity issues:

Note:
In the event of a recoverable error during the Batch Optimized Apply process, multiple attrep_changes tables may be created in the target database (each with a unique name). These tables will need to be deleted manually as the automatic cleanup process will not delete them.
These options are not displayed in bidirectional tasks since such tasks always use the "Preserve transactional integrity" option.
The following target endpoints do not support applying binary data types in Batch Optimized Apply mode:
ODBC, SAP Sybase IQ, SAP Sybase ASE, Vertica, IBM Netezza, Teradata, and Amazon Redshift.
When LOB columns are included in the replication, Batch optimized apply can only be used with the Limit LOB size to option. For more information about including LOB columns in the replication, see Metadata.
Using Batch Optimized Apply to apply changes to tables with foreign keys is not supported.
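As noted above, leftover attrep_changes tables created after a recoverable error must be dropped manually. A minimal way to spot candidates is to filter the target schema's table names by prefix. The sketch below assumes an `attrep_changes`-plus-unique-suffix naming scheme (the function name and pattern are illustrative assumptions; verify the actual names in your target database before dropping anything):

```python
import re

# Assumed naming scheme: "attrep_changes" followed by a unique suffix.
NET_CHANGES_PATTERN = re.compile(r"^attrep_changes\w*$", re.IGNORECASE)

def leftover_net_changes_tables(table_names):
    """Return the subset of table_names that look like Replicate net-changes tables."""
    return [name for name in table_names if NET_CHANGES_PATTERN.match(name)]

tables = ["orders", "attrep_changes_0a1b2c", "customers", "ATTREP_CHANGES_9f8e"]
print(leftover_net_changes_tables(tables))
# -> ['attrep_changes_0a1b2c', 'ATTREP_CHANGES_9f8e']
```

Review the matched list manually before issuing any DROP TABLE statements.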
The following options are available when Batch optimized apply is selected as the Change Processing Mode:
- Apply batched changes in intervals:
- Longer than: The minimum amount of time to wait between each application of batch changes. The default value is 1.
Increasing the Longer than value decreases the frequency with which changes are applied to the target while increasing the size of the batches. This can improve performance when applying changes to target endpoints that are optimized for processing large batches, such as Teradata, Vertica, and Pivotal Greenplum.
- But less than: The maximum amount of time to wait between each application of batch changes (before declaring a timeout). In other words, the maximum acceptable latency. The default value is 30. This value determines the maximum amount of time to wait before applying the changes, after the Longer than value has been reached.
- Force apply a batch when processing memory exceeds (MB): The maximum amount of memory to use for pre-processing in Batch optimized apply mode. The default value is 500.
For maximum batch size, set this value to the highest amount of memory you can allocate to Qlik Replicate. This can improve performance when applying changes to target endpoints that are optimized for processing large batches, such as Teradata, Vertica, and Pivotal Greenplum.
- Apply batched changes to multiple tables concurrently: Selecting this option should improve performance when applying changes from multiple source tables.
- Maximum number of tables: The maximum number of tables to apply batched changes to concurrently. The default is 5, the minimum is 2, and the maximum is 50.
When the Apply batched changes to multiple tables concurrently option is enabled, the following limitations apply:
Supported with the Snowflake on Azure, Snowflake on AWS, Microsoft SQL Server, Amazon Redshift, and Microsoft Azure SQL Database target endpoints only.
Error handling limitations:
The global error handling policy will be unavailable.
The task error handling policy defaults will be unchanged for Environmental and Table errors, but the Data errors and Apply Conflicts error handling options will be set as follows:
- Data truncation errors: Suspend table
- Other data errors: Suspend table
Apply Conflicts Errors:
- Deletes: Ignore record
- Inserts: UPDATE the existing target record
Note that this is not relevant for a Snowflake target, as Snowflake does not support Primary Keys.
- Updates: Ignore record
- The Escalation action for both Data errors and Apply Conflicts is not supported.
- Control Table limitations:
- The attrep_apply_exception Control Table is not supported.
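Conceptually, applying batched changes to multiple tables concurrently is a bounded worker pool keyed by table, capped at the Maximum number of tables setting. The sketch below is an illustration only; the function name, batch representation, and `apply_fn` callback are assumptions, not Replicate internals:

```python
from concurrent.futures import ThreadPoolExecutor

def apply_batches_concurrently(batches_by_table, apply_fn, max_tables=5):
    """Apply per-table change batches concurrently, with the worker count
    capped at the 'Maximum number of tables' setting (default 5, range 2-50).
    apply_fn(table, batch) is a hypothetical callback that applies one batch.
    """
    max_tables = max(2, min(max_tables, 50))  # clamp to the documented range
    with ThreadPoolExecutor(max_workers=max_tables) as pool:
        futures = {table: pool.submit(apply_fn, table, batch)
                   for table, batch in batches_by_table.items()}
        return {table: f.result() for table, f in futures.items()}
```

Note that in this model a slow table only delays its own batch, not the batches of the other tables, which is where the performance gain comes from.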
- Limit the number of changes applied per change processing statement to: To limit the number of changes applied in a single change processing statement, select this check box and then optionally change the default value. The default value is 10,000.

Note:
This option is not available for tasks configured with a Snowflake target endpoint.
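Taken together, the interval and memory settings above define when a pending batch is flushed to the target. A minimal sketch of that decision rule, using the default values (the function name and exact combination logic are illustrative assumptions, not Replicate internals):

```python
def should_apply_batch(elapsed_s, batch_memory_mb,
                       longer_than_s=1, less_than_s=30, force_memory_mb=500):
    """Decide whether the pending batch should be applied now.

    A batch is never applied before the minimum interval (Longer than) has
    passed. After that, it is applied early if pre-processing memory reaches
    the force-apply limit, or once the maximum interval (But less than,
    i.e. the maximum acceptable latency) is reached.
    """
    if elapsed_s < longer_than_s:
        return False                     # minimum wait between batch applies
    if batch_memory_mb >= force_memory_mb:
        return True                      # force apply: memory limit reached
    return elapsed_s >= less_than_s      # maximum acceptable latency reached
```

For example, with the defaults, a batch holding 600 MB is applied as soon as the 1-second minimum has passed, while a small batch waits the full 30 seconds.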
The following options are available when Transactional apply is selected as the Change Processing Mode:
- Minimum number of changes per transaction: The minimum number of changes to include in each transaction. The default value is 1000.

Note:
Replicate applies the changes to the target either when the number of changes is equal to or greater than the Minimum number of changes per transaction value OR when the batch timeout value is reached (see below) - whichever occurs first. Because the frequency of changes applied to the target is controlled by these two parameters, changes to the source records may not immediately be reflected in the target records.
- Maximum time to batch transactions before applying (seconds): The maximum time to collect transactions in batches before declaring a timeout. The default value is 60.
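The commit rule described in the note above (minimum change count OR batch timeout, whichever occurs first) can be sketched as a simple predicate. The names and defaults mirror the settings, but the function itself is illustrative, not Replicate internals:

```python
def should_commit_transactionally(pending_changes, elapsed_s,
                                  min_changes=1000, max_batch_s=60):
    """Commit when the pending change count reaches the 'Minimum number of
    changes per transaction' value OR the batch timeout elapses,
    whichever occurs first."""
    return pending_changes >= min_changes or elapsed_s >= max_batch_s
```

This is why source changes may not be reflected immediately on the target: a trickle of changes is held until the 60-second timeout fires.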
Transaction offload tuning
The following tuning options are available, regardless of which Change processing mode is selected:
Offload transaction in progress to disk if:
Qlik Replicate usually keeps transaction data in memory until it is fully committed to the source and/or target. However, transactions that are larger than the allocated memory or that are not committed within the specified time limit will be offloaded to disk.
- Transaction memory size exceeds (MB): The maximum size that all transactions can occupy in memory before being offloaded to disk. The default value is 1000.
- Transaction duration exceeds (seconds): The maximum time that each transaction can stay in memory before being offloaded to disk. The duration is calculated from the time that Qlik Replicate started capturing the transaction. The default value is 60.
- Statements cache size (number of statements): The maximum number of prepared statements to store on the server for later execution (when applying changes to the target). The default is 50. The maximum is 200.
Store task recovery data in target database: Select this option to store task-specific recovery information in the target database. When this option is selected, Replicate creates a table named attrep_txn_state in the target database. This table contains transaction data that can be used to recover a task in the event that the files in the Data folder are corrupted or if the storage device containing the Data folder has failed.
For more information about this option, see Recovering from data folder loss or corruption.
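The statements cache described above behaves like a bounded cache of prepared statements keyed by SQL text. The sketch below assumes least-recently-used eviction, which is an assumption about behavior the documentation does not specify:

```python
from collections import OrderedDict

class StatementCache:
    """Bounded cache of prepared statements (the 'Statements cache size'
    setting; default 50, maximum 200). Evicts the least recently used
    entry when full. Class name and eviction policy are illustrative."""

    def __init__(self, capacity=50):
        self.capacity = min(capacity, 200)   # clamp to the documented maximum
        self._cache = OrderedDict()

    def get(self, sql):
        stmt = self._cache.get(sql)
        if stmt is not None:
            self._cache.move_to_end(sql)     # mark as recently used
        return stmt

    def put(self, sql, prepared_stmt):
        self._cache[sql] = prepared_stmt
        self._cache.move_to_end(sql)
        if len(self._cache) > self.capacity:
            self._cache.popitem(last=False)  # evict least recently used
```

A larger cache avoids re-preparing frequently reused statements at the cost of server-side memory.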
Apply changes using SQL MERGE - When this option is not selected, the Batch Optimized Apply operation executes separate bulk INSERT, UPDATE, and DELETE statements for each of the different change types in the Replicate NET table. When it is selected, Replicate executes a single MERGE statement that applies the entire set of NET table changes to the target tables. While the default method is highly efficient, enabling the Apply changes using SQL MERGE option is even more efficient when working with endpoints that support it.
This is for the following reasons:
- It reduces the number of SQL statements run per table from three to one. Most UPDATE operations in large, immutable, file-based cloud databases (such as Google Cloud BigQuery) involve rewriting of affected files. With such operations, the reduction of per-table SQL statements from three to one is very significant.
- The target database only needs to scan the NET table once, significantly reducing I/O.
This option is currently in beta.
- When this option is enabled, non-fatal data errors or data errors that cannot be recovered will be handled as table errors.
- This option is not supported with the Salesforce and Oracle source endpoints. For more information, see the "Limitations and Considerations" topic for these endpoints.
- This option is available for tasks configured with the Google Cloud BigQuery target endpoint only.
- When this option is enabled, both the global error handling policy and the task-level error handling policy for Apply Conflicts will be unavailable.
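The effect of applying a set of net changes in a single pass, as a single MERGE statement does, can be illustrated with a small in-memory model (pure illustration; the real gain comes from the target database executing one statement instead of three):

```python
def merge_net_changes(target, net_changes):
    """Apply a set of net changes to a keyed target table in one pass,
    mimicking what a single SQL MERGE achieves versus separate bulk
    INSERT/UPDATE/DELETE statements.

    target: {primary_key: row}
    net_changes: list of (op, primary_key, row) with op in
    {'INSERT', 'UPDATE', 'DELETE'} - a simplified stand-in for the
    Replicate NET table.
    """
    for op, pk, row in net_changes:
        if op == 'DELETE':
            target.pop(pk, None)
        else:
            target[pk] = row  # INSERT and UPDATE both behave as an upsert
    return target
```

Each net-table row is visited exactly once, which is why the target only needs to scan the NET table a single time.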