Full Load Settings
Click the Full Load Settings sub-tab to configure the following:
Full Load Processing is ON/OFF.
Click this button to toggle full load on or off. The initial setting is determined when Adding tasks.
When full load is ON, Qlik Replicate loads the initial source data to the target endpoint.
Full load can be turned on or off at any stage even if change processing is on. Once the task begins to process changes, the full load on/off switch is used only as additional protection against accidental or unauthorized reload.
Target table preparation
If target table already exists: Select one of the following options from the list to determine how the target table is handled when Full Load starts:
The option to drop or truncate the target tables is relevant only if such operations are supported by the source endpoint.
- DROP and Create table: The table is dropped and a new table is created in its place.
Note: Replicate Control Tables will not be dropped. However, any suspended tables that are dropped will also be deleted from the attrep_suspended_tables Control Table if the associated task is reloaded.
- TRUNCATE before loading: Data is truncated without affecting the table metadata. Note that when this option is selected, enabling the Create primary key or unique index after full load completes option will have no effect.
- ARCHIVE and CREATE table: A copy of the existing table will be saved to the same schema before the new table is created. The archived table name will be appended with a timestamp, indicating when the archiving operation occurred (e.g. Customers_20170605175601).
Note: Currently this option is only available for the Hadoop target endpoint.
- Do nothing: Existing data and metadata of the target table will not be affected. New data will be added to the table.
Replicate expects the source column data types to be compatible with the corresponding target column data types. If you choose either TRUNCATE before loading or Do nothing and one or more target data types are different than the data types for the corresponding source columns, use a transformation to convert the data types as required.
For information on creating data type transformations, see Defining transformations for a single table/view.
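The archived table naming described under ARCHIVE and CREATE table can be illustrated with a short Python sketch. It assumes the timestamp follows a year-month-day-hour-minute-second layout, which is inferred from the example Customers_20170605175601; the function name archived_table_name is hypothetical and is not part of Replicate.

```python
from datetime import datetime


def archived_table_name(table_name: str, archived_at: datetime) -> str:
    """Append a YYYYMMDDhhmmss timestamp to the table name.

    The timestamp layout is an assumption inferred from the example
    in the text; this helper is illustrative, not a Replicate API.
    """
    return f"{table_name}_{archived_at.strftime('%Y%m%d%H%M%S')}"


# Archiving "Customers" on 5 June 2017 at 17:56:01
print(archived_table_name("Customers", datetime(2017, 6, 5, 17, 56, 1)))
# prints "Customers_20170605175601"
```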
Primary Key or Unique Index Creation
Create primary key or unique index after full load completes: Select this option if you want to delay primary key or unique index creation on the target until after full load completes.
Stopping the Task after Full Load
After Full Load completes, stop the task: You can set the task to stop automatically after Full Load completes. This is useful if you need to perform DBA operations on the target tables before the task’s Apply Changes (i.e. CDC) phase begins.
During Full Load, any DML operations executed on the source tables are cached. When Full Load completes, the cached changes are automatically applied to the target tables (as long as the Before/After cached changes have been applied option(s) described below are disabled).
This feature is not available for bidirectional replication tasks.
Select Before cached changes have been applied to stop the task before the cached changes are applied and/or After cached changes have been applied to stop the task after the cached changes are applied.
Selecting the Before cached changes have been applied option will stop the task after Full Load completes. Selecting the After cached changes have been applied option will stop the task as soon as data is consistent across all tables in the task.
When configuring Replicate to stop the task after Full Load completes, note the following:
- The task does not stop the moment Full Load completes. It will be stopped only after the first batch of changes has been captured (as this is what triggers the task to stop). This might take a while depending on how frequently the source database is updated. After the task stops, the changes will not be applied to the target until the task is resumed.
- The task will stop after Full Load completes, even if there are no cached changes to apply.
- The After cached changes have been applied option is not supported with any of the following file-based and Hadoop-based target endpoints:
  - File-based: File, Amazon S3, Microsoft Azure ADLS, and Google Storage.
  - Hadoop-based: Hadoop, Hortonworks Data Platform, Amazon EMR, Microsoft Azure HDInsight, Google Dataproc, Cloudera Data Platform (CDP) Private Cloud, and Microsoft Azure Databricks.
- Choosing to stop the task before cached changes have been applied may adversely affect performance, since the cached changes will only be applied to tables (even those that have already completed Full Load) after the last table completes Full Load.
- When the Before/After cached changes have been applied option is selected and a DDL operation is executed on one of the source tables during the Full Load process (in a Full Load and Apply Changes task), Replicate will reload the table. This effectively means that any DML operations executed on the source tables will be replicated to the target before the task stops.
- When working with the File Channel endpoint, these options should be set in the remote File Channel task and not in the local File Channel task.
For more information on the File Channel endpoint, see Using the Qlik Replicate file channel.
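The restrictions on the stop options can be summarized as a small pre-flight check. The following Python sketch is illustrative only: the function check_stop_options, its parameters, and the use of endpoint display names as plain strings are all assumptions and do not correspond to any Replicate API or task setting. The list of endpoints is taken from the notes above.

```python
# Target endpoints listed in the notes above as not supporting
# "After cached changes have been applied".
AFTER_CACHED_UNSUPPORTED_TARGETS = {
    # File-based
    "File", "Amazon S3", "Microsoft Azure ADLS", "Google Storage",
    # Hadoop-based
    "Hadoop", "Hortonworks Data Platform", "Amazon EMR",
    "Microsoft Azure HDInsight", "Google Dataproc",
    "Cloudera Data Platform (CDP) Private Cloud",
    "Microsoft Azure Databricks",
}


def check_stop_options(target, stop_before, stop_after, bidirectional=False):
    """Return a list of problems with the requested stop options.

    Hypothetical helper: it only encodes the restrictions stated in
    the text and does not talk to Replicate.
    """
    problems = []
    if bidirectional and (stop_before or stop_after):
        problems.append(
            "Stopping after Full Load is not available for "
            "bidirectional replication tasks."
        )
    if stop_after and target in AFTER_CACHED_UNSUPPORTED_TARGETS:
        problems.append(
            "'After cached changes have been applied' is not "
            f"supported with the {target} target endpoint."
        )
    return problems


# An empty list means none of the stated restrictions apply.
print(check_stop_options("Amazon S3", stop_before=False, stop_after=True))
print(check_stop_options("Amazon S3", stop_before=True, stop_after=False))
```

A check like this only covers the documented restrictions; it says nothing about the performance effect of stopping before cached changes are applied.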
Duplicate Record Prevention
Supported when using the IBM DB2 for z/OS and IBM DB2 for iSeries source endpoints only.
Select the Eliminate creation of duplicate records on full load option if you need to prevent duplicate records from being replicated during Full Load. You can either set the option at task level or per table.
Note that selecting this option could impact performance as Replicate instructs the source database to return the table records by Primary Key order and then removes any duplicate records.
For information on preventing creation of duplicate records at table level, see Full Load.
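The order-then-deduplicate approach described above can be sketched in Python. This is a simplified, in-memory illustration and not Replicate's implementation: it assumes that duplicates are records sharing the same Primary Key values, and the function name eliminate_duplicates and the sample rows are hypothetical. In Replicate the ordering is performed by the source database, which is where the performance cost arises.

```python
from itertools import groupby
from operator import itemgetter


def eliminate_duplicates(records, key_columns):
    """Order records by Primary Key and keep one record per key.

    Illustrative sketch: records are dicts, key_columns names the
    Primary Key columns. Sorting first makes duplicates adjacent,
    so they can be dropped in a single pass.
    """
    key = itemgetter(*key_columns)
    ordered = sorted(records, key=key)  # stands in for ORDER BY on the source
    return [next(group) for _, group in groupby(ordered, key=key)]


rows = [
    {"id": 2, "name": "Beta"},
    {"id": 1, "name": "Alpha"},
    {"id": 2, "name": "Beta"},  # duplicate of the first row
]
print(eliminate_duplicates(rows, ["id"]))
```

Sorting before deduplication means no set of previously seen keys has to be kept, but the full ordering is what makes the option potentially expensive on large tables.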