Click the Change Processing Tuning sub-tab to fine-tune the Apply Changes settings.
Determine which method will be used to apply changes.
Changes to tables without a Unique Index or Primary Key will always be applied in Transactional apply mode.
Applying cached events in transactional mode to endpoints that do not enforce constraints (such as Vertica and IBM Netezza), may result in duplicate records on the target. This is because such endpoints do not return duplicate errors.
Batch optimized apply: Select this to commit the changes in batches. In this case, a pre-processing action occurs to group the transactions into batches in the most efficient way. This may affect transactional integrity. Therefore, you must select one of the following to determine how the system will handle referential integrity issues:Information note
In the event of a recoverable error during the Batch optimized apply process, multiple attrep_changes (Net Changes) tables might be created in the target database (each with a unique name). These tables will need to be deleted manually as the automatic cleanup process will not delete them.
These options are not displayed in bidirectional tasks since such tasks always use the "Preserve transactional integrity" option.
The following target endpoints do not support applying binary data types in Batch optimized apply mode:
ODBC, SAP Sybase IQ, SAP Sybase ASE, Vertica, IBM Netezza, Teradata, and Amazon Redshift.
When LOB columns are included in the replication, Batch optimized apply can only be used with the Limit LOB size to option. For more information about including LOB columns in the replication, see Metadata.
Using Batch optimized apply to apply changes to tables with foreign keys is not supported.
The following options are available when Batch optimized apply is selected as the Change Processing Mode:
- Apply batched changes in intervals:
Longer than: The minimum amount of time to wait between each application of batch changes. The default value is 1.
Increasing the Longer than value decreases the frequency with which changes are applied to the target while increasing the size of the batches. This can improve performance when applying changes to target endpoints that are optimized for processing large batches, such as Teradata, Vertica, and Pivotal Greenplum.
- But less than: The maximum amount of time to wait between each application of batch changes (before declaring a timeout). In other words, the maximum acceptable latency. The default value is 30. This value determines the maximum amount of time to wait before applying the changes, after the Longer than value has been reached.
Force apply a batch when processing memory exceeds (MB): The maximum amount of memory to use for pre-processing in Batch optimized apply mode. The default value is 500.
For maximum batch size, set this value to the highest amount of memory you can allocate to Qlik Replicate. This can improve performance when applying changes to target endpoints that are optimized for processing large batches, such as Teradata, Vertica, and Pivotal Greenplum.
- Apply batched changes to multiple tables concurrently: Selecting this option should improve performance when applying changes from multiple source tables.
- Maximum number of tables: The maximum number of tables to apply batched changes to concurrently. The default is five, the maximum is 50, and the minimum is two.
When the Apply batched changes to multiple tables concurrently option is enabled, the following limitations apply:
Supported by the following target endpoints only: Snowflake on Azure, Snowflake on AWS, Snowflake on Google, Microsoft SQL Server, Amazon Redshift, Microsoft Azure Databricks Delta, Microsoft Azure Synapse Analytics, and Microsoft Azure SQL Database.
Error handling limitations:
The global error handling policy will be unavailable.
The task error handling policy defaults will be unchanged for Environmental and Table errors, but the defaults for Data errors and Apply Conflicts errors will be as follows:
- Data truncation errors: Ignore record
- Other data errors: Suspend table
Apply Conflicts Errors:
- Deletes: Ignore record
Inserts: UPDATE the existing target record
Note that this is not relevant for a Snowflake target (as Snowflake does not support Primary Keys).
- Updates: Ignore record
- The Escalation action for both Data errors and Apply Conflicts is not supported.
- Control Table limitations:
- The attrep_apply_exception Control Table is not supported.
Limit the number of changes applied per change processing statement to: To limit the number of changes applied in a single change processing statement, select this check box and then optionally change the default value. The default value is 10,000.Information note
This option is not available for tasks configured with the following target endpoints:
- Microsoft SQL Server
- Microsoft Azure SQL Database
- Databricks Lakehouse (Delta) (When using Microsoft Azure Data Lake Storage (ADLS) Gen2 storage)
- Microsoft Azure Synapse Analytics
- Google Cloud SQL for SQL Server
- Amazon Redshift
- Snowflake on AWS
- Snowflake on Azure
- Snowflake on Google
The following options are available when Transactional apply is selected as the Change Processing Mode:
Minimum number of changes per transaction: The minimum number of changes to include in each transaction. The default value is 1000.Information note
Replicate applies the changes to the target either when the number of changes is equal to or greater than the Minimum number of changes per transaction value OR when the batch timeout value is reached (see below) - whichever occurs first. Because the frequency of changes applied to the target is controlled by these two parameters, changes to the source records may not immediately be reflected in the target records.
- Maximum time to batch transactions before applying (seconds): The maximum time to collect transactions in batches before declaring a timeout. The default value is 60.
Transaction offload tuning
The following tuning options are available, regardless of which Change processing mode is selected:
Offload transaction in progress to disk if:
Qlik Replicate usually keeps transaction data in memory until it is fully committed to the source and/or target. However, transactions that are larger than the allocated memory or that are not committed within the specified time limit will be offloaded to disk.
- Transaction memory size exceeds (MB): The maximum size that all transactions can occupy in memory before being offloaded to disk. The default value is 1000.
- Transaction duration exceeds (seconds): The maximum time that each transaction can stay in memory before being offloaded to disk. The duration is calculated from the time that Qlik Replicate started capturing the transaction. The default value is 60.
- Statements cache size (number of statements): The maximum number of prepared statements to store on the server for later execution (when applying changes to the target). The default is 50. The maximum is 200.
Store task recovery data in target database: Select this option to store task-specific recovery information in the target database. When this option is selected, Replicate creates a table named attrep_txn_state in the target database. This table contains transaction data that can be used to recover a task in the event that the files in the Data folder are corrupted or if the storage device containing the Data folder has failed.
For more information about this option, see Recovering from data folder loss or corruption.
DELETE and INSERT when updating a primary key column: For tasks configured with streaming target endpoints (for example, Kafka), the DELETE+INSERT statements will be executed on the actual target. For all other target endpoints, they will be executed in the associated Change Tables. This option requires full supplemental logging to be turned on in the source database.
Apply changes using SQL MERGE - When this option is not selected, the Batch optimized apply operation executes separate bulk INSERT, UPDATE and DELETE statements for each of the different change types in the Replicate Net Changes table.Tip noteThe Replicate Net Changes table is created on the target when working in Batch optimized apply mode and contains records that were changed on the source. It is truncated each time the source changes are applied to the target. For more information on the Net Changes table, see Net Changes table
While this method is highly efficient, enabling the Apply changes using SQL MERGE option is even more efficient when working with endpoints that support this option.
This is due to the following reasons:
- It reduces the number of SQL statements run per table from three to one. Most UPDATE operations in large, immutable, file-based cloud databases (such as Google Cloud BigQuery), involve rewriting of affected files. With such operations, the reduction of per-table SQL statements from three to one is very significant.
- The target database only needs to scan the Replicate net changes table once, significantly reducing I/O.
Optimize inserts: When Apply changes using SQL MERGE is selected together with this option and the changes consist of INSERTs only, Replicate will perform INSERTs instead of using SQL MERGE. Note that while this will usually improve performance and thereby reduce costs, it might also result in duplicate records in the target database.
- The Apply changes using SQL MERGE and Optimize inserts options are available for tasks configured with the following target endpoints only:
- Google Cloud BigQuery
- Databricks Lakehouse (Delta)
- Snowflake on Azure, Snowflake on Google, and Snowflake on Amazon
- The Apply changes using SQL MERGE and Optimize inserts options are not supported with the following source endpoints:
For more information, see the "Limitations and Considerations" topic for these endpoints.
When the Apply changes using SQL MERGE option is enabled (either alone or together with the Optimize inserts option):
- Non-fatal data errors or data errors that cannot be recovered will be handled as table errors
The Global error handling policy will not be available
- The Apply Conflicts error handling policy will be preset and read-only. If the Optimize inserts option is also selected, the Duplicate key when applying INSERT Apply Conflicts error-handling option will be set to Allow duplicates in the target.
- Some of the Data error handling policy options will not be available.
- The operations will only be performed on the final target tables. For Change Tables or the Audit Table, INSERTs will be performed.