
Data replication settings

You can configure settings for the data replication task.

  • Open the replication task and click Settings in the toolbar.

The Settings: <Task-Name> dialog opens. The available settings are described below.

Task modes

In this tab, you can set the replication mode and determine how changes will be applied to the target.

Replication mode

The following replication modes are available:

Information note: When replicating from SaaS application sources, the Full load replication mode is enabled by default and cannot be disabled.
  • Full load: Loads the data from the selected source tables to the target platform and creates the target tables if necessary. The full load occurs automatically when the task is started, but can also be performed manually should the need arise.
  • Apply changes: Keeps the target tables continually up-to-date with any changes made to the source tables.
  • Store changes: Stores the changes to the source tables in Change Tables (one per source table).

    For more information, see Store changes.

If you select Store changes or Apply changes and your source data contains tables that do not support CDC, or views, two data pipelines will be created: one for tables that support CDC, and another for all other tables and views, which support Full load only.

Information note: ALTER TABLE DDL operations are not currently supported. Other DDL operations such as DROP TABLE and TRUNCATE TABLE are supported.

Apply changes mode

Changes are applied to the target tables using one of the following methods:

  • Batch optimized: This is the default. When this option is selected, changes are applied in batches. A preprocessing action occurs to group the transactions into batches in the most efficient way.
  • Transactional: Select this option to apply each transaction individually, in the order it is committed. In this case, strict referential integrity is ensured for all tables.

Metadata

Basic

  • Target table schema: The schema on the target to which the source tables will be replicated if you do not want to use the source table schema (or if there is no schema in the source database).

    Information note: When replicating to an Oracle target, the default target table schema is "system". Note also that if you leave this field empty (in which case the source schema will be used), you must make sure that the source schema already exists on the target. Otherwise, the task will fail.
    Information note: The maximum permitted length for the schema name is 128 characters.
  • Do not include LOB columns: Select this if you do not want source LOB columns to be replicated.
  • Limit LOB column size to (KB): This is the default. Select this option if you only need to replicate small LOBs or if the target platform does not support unlimited LOB size. The maximum permitted value for this field is 102400 KB (100 MB).

    When replicating small LOBs, this option is more efficient than the Do not limit LOB column size option since the LOBs are replicated "inline" as opposed to via "lookup" from the source. During Change Processing, small LOBs are usually replicated via "lookup" from the source.

    As the limit is applied to the LOB size in bytes, the value to specify should be calculated according to the following formulas (a worked sketch follows this option list):

    • BLOB: The length of the largest LOB.
    • NCLOB: The length of the longest TEXT in characters multiplied by two (as each character is handled as a double-byte). If the data includes 4-byte characters, multiply it by four.
    • CLOB: The length of the longest TEXT in characters (as each character is handled as a UTF8 character). If the data includes 4-byte characters, multiply it by two.
    Information note
    • Any LOBs larger than the specified size will be truncated.
    • During Change Processing from Oracle source, inline BLOBs are replicated inline.
    • Changes to this setting will only affect existing tables after they are reloaded.
  • Do not limit LOB column size: When this option is selected, LOB columns will be replicated, regardless of size.

    Information note: Replicating LOB columns may impact performance. This is especially true for large LOB columns, which require Data Movement gateway to perform a lookup from the source table in order to retrieve the source LOB value.
    • Optimize when LOB size is less than: Select this option when you need to replicate both small and large LOBs, and most of the LOBs are small.

      Information note

      This option is supported with the following sources and targets only:

      • Sources: Oracle, Microsoft SQL Server, MySQL, PostgreSQL, and IBM DB2 for LUW

      • Targets: Oracle, Microsoft SQL Server, MySQL, and PostgreSQL

      When this option is selected, during Full Load, the small LOBs will be replicated "inline" (which is more efficient), and the large LOBs will be replicated by performing a lookup from the source table. During Change Processing, however, both small and large LOBs will be replicated by performing a lookup from the source table.

      Information note

      When this option is selected, Data Movement gateway will check all of the LOB sizes to determine which ones to transfer "inline". LOBs larger than the specified size will be replicated using Full LOB mode.

      Therefore, if you know that most of the LOBs are larger than the specified setting, it is preferable to use the Do not limit LOB column size option instead.

    • Chunk size (KB): Optionally, change the size of the LOB chunks to use when replicating the data to the target. The default chunk size should suffice in most cases, but if you encounter performance issues, adjusting the size may improve performance.

      Information note

      With some databases, data type validation occurs when the data is inserted or updated. In such cases, replication of structured data types (e.g. XML, JSON, GEOGRAPHY, etc.) may fail if the data is bigger than the specified chunk size.
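If it helps to see the arithmetic, the following is a minimal sketch of the size calculation described under Limit LOB column size to, assuming the byte result is rounded up to whole KB for the field. The helper name and sample values are illustrative, not part of the product:

```python
def lob_limit_kb(lob_type: str, longest_len: int, has_4byte_chars: bool = False) -> int:
    """Estimate the value for 'Limit LOB column size to (KB)'.

    longest_len is the size of the largest LOB in bytes for BLOB,
    or the length of the longest TEXT in characters for NCLOB/CLOB.
    """
    if lob_type == "BLOB":
        size_bytes = longest_len                                   # length of the largest LOB
    elif lob_type == "NCLOB":
        size_bytes = longest_len * (4 if has_4byte_chars else 2)   # double-byte characters
    elif lob_type == "CLOB":
        size_bytes = longest_len * (2 if has_4byte_chars else 1)   # UTF-8 characters
    else:
        raise ValueError(f"unsupported LOB type: {lob_type}")
    # Round up to whole KB; the field accepts at most 102400 KB (100 MB).
    return min(-(-size_bytes // 1024), 102400)

# A CLOB whose longest text is 300,000 characters of 4-byte data:
print(lob_limit_kb("CLOB", 300_000, has_4byte_chars=True))  # 586
```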

Advanced

Control table settings

  • Control tables schema: Specify the target schema for the Control Tables if you do not want them to be created in the source schema (the default) or in the target schema.

    Information note: The maximum permitted length for the schema name is 128 characters.
  • Create target control tables in tablespace: When the target connector is Oracle, specify the tablespace where you want the target control tables to be created. If you do not enter any information in this field, the tables will be created in the default tablespace in the target database.
  • Create indexes for target control tables in tablespace: When the target connector is Oracle, specify the tablespace where you want the control table indexes to be created. If you do not enter any information in this field, the indexes will be created in the same tablespace as the control tables.
  • Replication history timeslot (minutes): The length of each time slot in the Replication history Control Table. The default is 5 minutes.

Control table selection

Select the Control Tables you want to be created on the target platform:

Logical name          Name in target
Apply exceptions      attrep_apply_exceptions
Replication status    attrep_status
Suspended tables      attrep_suspended_tables
Replication history   attrep_history

For more information on Control Tables, see Control Tables.

Full load

Basic

If target table already exists: Select one of the following from the list to determine how to handle loading the target when full load starts:

Information note

The option to drop or truncate the target tables is relevant only if such operations are supported by the source endpoint.

  • Drop and create table: The table is dropped and a new table is created in its place.

    Information note

    Data Movement gateway Control Tables will not be dropped. However, any suspended tables that are dropped will also be deleted from the attrep_suspended_tables Control Table if the associated task is reloaded.

  • TRUNCATE before loading: Data is truncated without affecting the table metadata. Note that when this option is selected, enabling the Create primary key or unique index after full load completes option will have no effect.
  • Ignore: Existing data and metadata of the target table will not be affected. New data will be added to the table.

Advanced

Performance tuning

The following performance tuning parameters can be set:

  • Maximum number of tables: Enter the maximum number of tables to load into the target at one time. The default value is 5.
  • Transaction consistency timeout (seconds): Enter the number of seconds that Data Movement gateway waits for transactions to close, if they are open when the task starts, before beginning the Full Load operation. The default value is 600 (10 minutes). Data Movement gateway will begin the full load after the timeout value is reached even if there are open transactions.

    Information note: To replicate transactions that were open when Full Load started but were only committed after the timeout value was reached, you need to reload the target tables.
  • Commit rate during full load: The maximum number of events that can be transferred together. The default value is 10000.

After full load completes

You can set the task to stop automatically after Full Load completes. This is useful if you need to perform DBA operations on the target tables before the task’s Apply Changes (i.e. CDC) phase begins.

During Full Load, any DML operations executed on the source tables are cached. When Full Load completes, the cached changes are automatically applied to the target tables (as long as the Before/After cached changes are applied options described below are disabled).

  • Create primary key or unique index after full load completes: Select this option if you want to delay the creation of the primary key or unique index on the target until after full load completes.
  • Stop the task:
    Information note: These settings are not available when replicating from SaaS application sources as there are no cached events.
    • Before cached changes are applied: Select to stop the task after Full Load completes.

    • After cached changes are applied: Select to stop the task as soon as data is consistent across all tables in the task.

    Information note

    When configuring the task to be stopped after Full Load completes, note the following:

    • The task does not stop the moment Full Load completes. It will be stopped only after the first batch of changes has been captured (as this is what triggers the task to stop). This might take a while depending on how frequently the source database is updated. After the task stops, the changes will not be applied to the target until the task is resumed.
    • Choosing Before cached changes are applied might impact performance, as the cached changes will only be applied to tables (even those that have already completed Full Load) after the last table completes Full Load.
    • When one of these options is selected and a DDL is executed on one of the source tables during the Full Load process (in a Full Load and Apply Changes task), Data Movement gateway will reload the table. This effectively means that any DML operations executed on the source tables will be replicated to the target before the task stops.

Apply changes

These settings are only available when the Apply changes replication mode is enabled.

Basic

Apply DDL changes to target table

Information note: These settings are not available when replicating from SaaS application sources.

The following options determine whether DDL operations performed on the corresponding source table will also be performed on the target table.

  • Drop table: Select to drop the target table when the source table is dropped.
  • Truncate table: Select to truncate the target table when the source table is truncated.
  • Alter table: Select to alter the target table when the source table is altered.

Change processing tuning

  • Apply batched changes to multiple tables concurrently: Selecting this option might improve performance when applying changes from multiple source tables. 

    Information note

    This option is only supported when:

    • The target is Microsoft SQL Server.
    • The Apply changes mode is set to Batch optimized. For more information, see Task modes.
    • Maximum number of tables: The maximum number of tables to apply batched changes to concurrently. The default is 5, the maximum is 50, and the minimum is 2.

When the Apply batched changes to multiple tables concurrently option is enabled, the following limitations apply:

  • The task error handling policy defaults will be unchanged for Environmental and Table errors, but the defaults for Data errors and Apply Conflicts errors will be as follows:

    • Data Errors:
      • Data truncation errors: Log record to the exceptions table
      • Other data errors: Suspend table
    • Apply Conflicts Errors:
      • Deletes: Ignore record
      • Inserts: UPDATE the existing target record
      • Updates: Ignore record
    • Escalation Action:
      • The Escalation action for both Data errors and Apply Conflicts is not supported.
    • The attrep_apply_exceptions Control Table is not supported.
    • For data errors:
      • The Log record to the exceptions table option is only available for data truncation errors.
      • There is no Ignore option.
    • For Apply conflicts:
      • The Log record to exceptions table option is not available.
      • The Ignore option is only available for the No record found for applying an UPDATE apply conflict.

See also: Error handling.

Advanced

See Change processing tuning.

Store changes

These settings are only available when the Store changes replication mode is enabled.

Basic

DDL options

Information noteThese settings are not available when replicating from SaaS application sources.

Select one of the following options to determine how to handle DDL operations on the source tables:

  • Apply to change table: DDL operations to the source tables (such as a column being added) will be applied to the corresponding Change Tables only.
  • Ignore: All DDL operations to source tables will be ignored.

Advanced

On update

Information noteThis setting is not relevant for SaaS application sources as changes are applied as INSERT operations.

Select Store before and after image to store both the pre-UPDATE data and the post-UPDATE data. When not selected, only the post-UPDATE data will be stored.

Change table creation

The following section describes the options that are available when storing changes in Change Tables.

  • Suffix: Specify a string to use as the suffix for all Change Tables. The default value is __ct. The Change Table name is the name of the target table with the suffix appended. For example, with the default value, the Change Table for a target table named HR will be named HR__ct.
  • Header column prefix: Specify a string to use as the prefix for all of the Change Table header columns. The default value is header__. For example, with the default value, the header column stream_position will be named header__stream_position.
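As a quick illustration of these naming rules (hypothetical helpers, not a product API):

```python
def change_table_name(target_table: str, suffix: str = "__ct") -> str:
    # Change Table name = target table name + suffix
    return f"{target_table}{suffix}"

def header_column_name(column: str, prefix: str = "header__") -> str:
    # header column name = prefix + column name
    return f"{prefix}{column}"

print(change_table_name("HR"))                # HR__ct
print(header_column_name("stream_position"))  # header__stream_position
```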

For more information on Change Tables, see Using Change Tables.

If change table exists when full load starts: Select one of the following to determine how to load the Change Tables when Full Load replication starts:

  • Drop and create change table: The table is dropped and a new table is created in its place.
  • Delete old changes and store new changes in existing change table: Data is truncated and added without affecting the table metadata.
  • Keep old changes and store new changes in existing change table: Data and metadata of the existing Change table are not affected.

Table header columns

The Change Table header columns provide information about Change Processing operations such as the type of operation (for example, INSERT), the commit time, and so on. If you do not need all of this information, you can configure the task to create the Change Tables with selected header columns (or none at all), thereby reducing their footprint in the target database.

Information note: You cannot remove additional columns or restore columns while a task is running. To change your initial selection, you first need to stop the task, then modify your selection, and finally reload the target tables.

For a description of the header columns, see Change Tables.

Error handling

Basic

Apply conflicts

Duplicate key when applying INSERT: Select what action to take when there is a conflict with an INSERT operation.

Information note: When replicating from SaaS application sources, only the UPDATE the existing target record action is supported.
  • Ignore: The task continues and the error is ignored.
  • UPDATE the existing target record: The target record with the same primary key as the INSERTED source record is updated.
  • Log record to the exceptions table (default): The task continues and the error is written to the exceptions table.
  • Suspend table: The task continues but data from the table with the error record is moved into an error state and its data is not replicated.
  • Stop task: The task is stopped and manual intervention is required.

No record found for applying an UPDATE: Select what action to take when there is a conflict with an UPDATE operation.

Information note: These settings are not available when replicating from SaaS application sources.
  • Ignore: The task continues and the error is ignored.
  • INSERT the missing target record: The missing target record will be inserted into the target table. When the source endpoint is Oracle, selecting this option requires supplemental logging to be enabled for all the source table columns.
  • Log record to the exceptions table (default): The task continues and the error is written to the exceptions table.
  • Suspend table: The task continues but data from the table with the error record is moved into an error state and its data is not replicated.
  • Stop task: The task is stopped and manual intervention is required.

Advanced

Information note

Data error handling is supported in Apply Changes replication mode only (not Full Load).

Data errors

For data truncation errors: Select what you want to happen when a truncation occurs in one or more specific records. You can select one of the following from the list:

  • Ignore: The task continues and the error is ignored.
  • Log record to the exceptions table (default): The task continues and the error is written to the exceptions table.
  • Suspend table: The task continues, but data from the table with the error record is moved into an error state and its data is not replicated.
  • Stop task: The task is stopped and manual intervention is required.

For other data errors: Select what you want to happen when an error occurs in one or more specific records. You can select one of the following from the list:

  • Ignore: The task continues and the error is ignored.
  • Log record to the exceptions table (default): The task continues and the error is written to the exceptions table.
  • Suspend table: The task continues, but data from the table with the error record is moved into an error state and its data is not replicated.
  • Stop task: The task is stopped and manual intervention is required.

Escalate error handling when other data errors reach (per table): Select this check box to escalate error handling when the number of non-truncation data errors (per table) reaches the specified amount. Valid values are 1-10,000.

Escalation action: Choose what should happen when error handling is escalated. Note that the available actions are dependent on the action selected from the For other data errors drop-down list described above.

  • Suspend table (default): The task continues, but data from the table with the error record is moved into an error state and its data is not replicated.

    Information note

    The behavior differs according to the Change Processing Mode:

    • In Transactional apply mode, the last changes will not be replicated.

    • In Batch optimized apply mode, data might not be replicated at all or might be only partially replicated.

  • Stop task: The task is stopped and manual intervention is required.
  • Log record to the exceptions table: The task continues and the record is written to the exceptions table.
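Conceptually, escalation is a per-table error counter: each record-level error is handled by the selected action until the count reaches the configured threshold, at which point the escalation action takes over. A minimal sketch under those assumptions (all names are illustrative):

```python
from collections import defaultdict

ESCALATION_THRESHOLD = 100      # "Escalate error handling when other data errors reach (per table)"
DATA_ERROR_ACTION = "log"       # "For other data errors" selection
ESCALATION_ACTION = "suspend"   # "Escalation action" selection

error_counts = defaultdict(int)  # errors seen so far, per table

def on_data_error(table: str) -> str:
    """Return the action to apply for a data error on the given table."""
    error_counts[table] += 1
    if error_counts[table] >= ESCALATION_THRESHOLD:
        return ESCALATION_ACTION  # e.g. suspend the table or stop the task
    return DATA_ERROR_ACTION      # e.g. log the record to the exceptions table

print(on_data_error("ORDERS"))    # "log" until the per-table threshold is reached
```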

Table errors

When encountering a table error: Select one of the following from the drop-down list:

  • Suspend table (default): The task continues, but data from the table with the error record is moved into an error state and its data is not replicated.
  • Stop task: The task is stopped and manual intervention is required.

Escalate error handling when table errors reach (per table): Select this check box to escalate error handling when the number of table errors (per table) reaches the specified amount. Valid values are 1-10,000.

Escalation action: The escalation policy for table errors is set to Stop task and cannot be changed.

Apply conflicts

Information note: These settings are not available when replicating from SaaS application sources.

No record found for applying a DELETE: Select what action is taken when there is a conflict with a DELETE operation.

  • Ignore: The task continues and the error is ignored.
  • Log record to the exceptions table: The task continues and the record is written to the exceptions table.
  • Suspend table: The task continues but data from the table with the error record is moved into an error state and its data is not replicated.
  • Stop task: The task is stopped and manual intervention is required.

Escalate error handling when apply conflicts reach (per table): Select this check box to escalate error handling when the number of apply conflicts (per table) reaches the specified amount. Valid values are 1-10,000.

Escalation action: Choose what should happen when error handling is escalated:

  • Log record to the exceptions table (default): The task continues and the error is written to the exceptions table.
  • Suspend table: The task continues, but data from the table with the error record is moved into an error state and its data is not replicated.

    Information note

    The behavior differs according to the Change Processing Mode:

    • In Transactional apply mode, the last changes will not be replicated.

    • In Batch optimized apply mode, data might not be replicated at all or might be only partially replicated.

  • Stop task: The task is stopped and manual intervention is required.

Environmental errors

  • Maximum retry count: Select this option and then specify the maximum number of attempts to retry a task when a recoverable environmental error occurs. After the task has been retried the specified number of times, the task is stopped and manual intervention is required.

    To never retry a task, clear the check box or specify "0".

    To retry a task an infinite number of times, specify "-1".

    • Interval between retry attempts (seconds): Use the counter to select or type the number of seconds that the system waits between attempts to retry a task.

      Valid values are 0-2,000.

  • Increase retry interval for long outages: Select this check box to increase the retry interval for long outages. When this option is enabled, the interval between each retry attempt is doubled until the Maximum retry interval is reached, after which retries continue at the specified maximum interval.
    • Maximum retry interval (seconds): Use the counter to select or type the number of seconds to wait between attempts to retry a task when the Increase retry interval for long outages option is enabled. Valid values are 0-2,000.
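The Increase retry interval for long outages behavior amounts to a doubling backoff capped at the maximum interval. A minimal sketch of the resulting wait sequence (parameter names are illustrative):

```python
def retry_intervals(base_interval: int, max_interval: int, max_retries: int):
    """Yield the wait in seconds before each retry attempt: the interval
    doubles after every attempt until capped at max_interval."""
    interval = base_interval
    for _ in range(max_retries):
        yield interval
        interval = min(interval * 2, max_interval)

# Interval between retry attempts = 5, Maximum retry interval = 120, 8 retries:
print(list(retry_intervals(5, 120, 8)))  # [5, 10, 20, 40, 80, 120, 120, 120]
```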

Change processing tuning

Transaction offload tuning

The following tuning options are available, regardless of which Change processing mode is selected:

  • Offload transaction in progress to disk if:

    Qlik Cloud Data Integration usually keeps transaction data in memory until it is fully committed to the source and/or target. However, transactions that are larger than the allocated memory or that are not committed within the specified time limit will be offloaded to disk.

    • Transaction memory size exceeds (MB): The maximum size that all transactions can occupy in memory before being offloaded to disk. The default value is 1024.
    • Transaction duration exceeds (seconds): The maximum time that each transaction can stay in memory before being offloaded to disk. The duration is calculated from the time that Qlik Cloud Data Integration started capturing the transaction. The default value is 60.

Miscellaneous tuning

  • Statements cache size (number of statements): The maximum number of prepared statements to store on the server for later execution (when applying changes to the target). The default is 50. The maximum is 200.
  • Store task recovery data in target database: Select this option to store task-specific recovery information in the target database. When this option is selected, Data Movement gateway creates a table named attrep_txn_state in the target database. This table contains transaction data that can be used to recover a task in the event that the files in the Data Movement gateway Data folder are corrupted or if the storage device containing the Data folder has failed.

Batch optimized apply

The following settings are available only when the Apply mode is set to Batch optimized.

  • Apply batched changes in intervals:
    • Longer than: The minimum amount of time to wait between each application of batch changes. The default value is 1.

      Increasing the Longer than value decreases the frequency with which changes are applied to the target while increasing the size of the batches. This can improve performance when applying changes to target databases that are optimized for processing large batches.

    • Less than: The maximum amount of time to wait between each application of batch changes (before declaring a timeout). In other words, the maximum acceptable latency. The default value is 30. This value determines the maximum amount of time to wait before applying the changes, after the Longer than value has been reached.
  • Force apply a batch when processing memory exceeds (MB): The maximum amount of memory to use for pre-processing in Batch optimized apply mode. The default value is 500.

    For maximum batch size, set this value to the highest amount of memory you can allocate to Data Movement gateway. This can improve performance when applying changes to target databases that are optimized for processing large batches.
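One way to picture how these three settings interact is as a simple apply trigger: the memory ceiling or the maximum latency forces an apply, and otherwise a prepared batch is applied once the minimum interval has elapsed. This is a simplified sketch with illustrative names (including the assumed batch_ready signal), not the gateway's actual internal logic:

```python
def should_apply_batch(elapsed_s: float, memory_mb: float, batch_ready: bool,
                       longer_than: float = 1.0,       # "Longer than" (default 1)
                       less_than: float = 30.0,        # "Less than" (default 30)
                       force_apply_mb: float = 500.0   # memory ceiling (default 500)
                       ) -> bool:
    if memory_mb >= force_apply_mb:   # "Force apply a batch when processing memory exceeds (MB)"
        return True
    if elapsed_s >= less_than:        # maximum acceptable latency reached
        return True
    return batch_ready and elapsed_s >= longer_than  # minimum batching interval reached

print(should_apply_batch(elapsed_s=0.5, memory_mb=100, batch_ready=True))  # False
print(should_apply_batch(elapsed_s=2.0, memory_mb=100, batch_ready=True))  # True
```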

Transactional apply

The following settings are available only when the Apply mode is set to Transactional.

  • Minimum number of changes per transaction: The minimum number of changes to include in each transaction. The default value is 1000.

    Information note: Data Movement gateway applies the changes to the target either when the number of changes is equal to or greater than the Minimum number of changes per transaction value OR when the batch timeout value is reached (see below), whichever occurs first. Because the frequency of changes applied to the target is controlled by these two parameters, changes to the source records may not immediately be reflected in the target records.
  • Maximum time to batch transactions before applying (seconds): The maximum time to collect transactions in batches before declaring a timeout. The default value is 1.
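The Information note above reduces to a simple OR condition; a minimal sketch (names are illustrative):

```python
def should_apply(change_count: int, elapsed_s: float,
                 min_changes: int = 1000,       # Minimum number of changes per transaction
                 batch_timeout_s: float = 1.0   # Maximum time to batch transactions
                 ) -> bool:
    """Apply changes when the change count reaches the minimum OR the
    batch timeout is reached, whichever occurs first."""
    return change_count >= min_changes or elapsed_s >= batch_timeout_s

print(should_apply(250, 0.4))  # False: keep collecting changes
print(should_apply(250, 1.2))  # True: the timeout was reached first
```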

Character substitution

You can substitute or delete source characters in the target database and/or you can substitute or delete source characters that are not supported by a selected character set.

Information note
  • All characters must be specified as Unicode code points.

  • Character substitution will also be performed on the Control tables.
  • Invalid values will be indicated by a red triangle in the top right of the table cell. Hovering your mouse cursor over the triangle will show the error message.

  • Any table-level or global transformations defined for the task will be performed after the character substitution has been completed.

  • Substitution actions defined in the Substitute or Delete Source Characters table are performed before the substitution action defined in the Substitute or Delete Source Characters Unsupported by the Selected Character Set table.

  • Character substitution does not support LOB data types.

Substituting or deleting source characters

Use the Substitute or delete source characters table to define replacements for specific source characters. This may be useful, for example, when the Unicode representation of a character is different on the source and target platforms. On Linux, for example, the minus character in the Shift_JIS character set is represented as U+2212, whereas on Windows it is represented as U+FF0D.

Substitution actions

To define substitution actions:

  1. Click the Add character button above the table.

  2. Specify a source character and a target character in the Source character and Substitute character fields respectively.

    For example, to replace the letter "a" with the letter "e", specify 0061 and 0065 respectively.

    Information note

    To delete the specified source character, enter 0 in the Substitute character column.

  3. Repeat steps 1-2 to replace or delete additional characters.

To edit the specified source or target character, click the menu at the end of the row and select Edit.

To delete an entry from the table, click the menu at the end of the row and select Delete.
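Conceptually, each row of the table maps one Unicode code point to another, with 0 meaning delete. The following sketch applies such a mapping to a string (a hypothetical illustration, not the product implementation):

```python
# Source code point -> substitute code point; 0 means "delete the character".
substitutions = {
    0x0061: 0x0065,   # "a" -> "e", as in the example above
    0x2212: 0xFF0D,   # minus sign (Linux Shift_JIS) -> fullwidth hyphen-minus (Windows)
    0x00A0: 0,        # delete no-break spaces
}

def substitute(text: str) -> str:
    out = []
    for ch in text:
        code_point = substitutions.get(ord(ch), ord(ch))
        if code_point != 0:       # 0 in the Substitute character column = delete
            out.append(chr(code_point))
    return "".join(out)

# "a" replaced with "e", the minus substituted, and the no-break space deleted:
print(substitute("a \u2212 b\u00a0c"))  # e － bc
```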

Substituting or deleting source characters unsupported by the selected character set

Use the Unsupported source characters by character set table to define a single replacement character for all characters not supported by the selected character set.

Unsupported character substitution actions

To define or edit a substitution action:

  1. Select a character set from the Character set drop-down list in the table.

    Any characters not supported by the selected character set will be replaced on the target by the character specified in step 2 below.

  2. In the Substitute character column, click anywhere in the column and specify the replacement character. For example, to replace all unsupported characters with the letter "a", enter 0061.

    Information note

    To delete all unsupported characters, enter 0.

To disable the substitution action, select the blank entry from the Character set drop-down list.

More options

These options are not exposed in the UI as they are only relevant to specific versions or environments. Consequently, do not set these options unless explicitly instructed to do so by Qlik Support or product documentation.

To set an option, copy the option name into the Add feature name field and click Add. Then set the value or enable the option according to the instructions you received.

SaaS application

Information note: These settings are relevant only when replicating from SaaS application sources.

For initial load

When moving data from a SaaS application source, you can set how to perform the initial full load:

Information note: These settings require Data Movement gateway version 2022.11.74 or later.
Use cached data

This option lets you use cached data that was read when generating metadata with Full data scan selected.

This creates less overhead in terms of API use and quotas, as the data has already been read from the source. Any changes since the initial data scan are picked up by Change data capture (CDC).

Load data from source

This option performs a new load from the data source. This option is useful if:

  • The metadata scan was not performed recently.

  • The source dataset is small and frequently changing, and you do not want to maintain a full history of changes.

Read changes every (Minutes)

Set the interval, in minutes, between reads of changes from the source. The valid range is 1 to 1440.

Information note: This option is only available for data tasks configured with the Change data capture (CDC) update method.
