The Merge prefix can be added to any LOAD or SELECT statement in the script to specify that the loaded table should be merged into another table. It also specifies that this statement should be run in a partial reload.
The typical use case is when you load a change log and want to use this to apply inserts, updates, and deletes to an existing table.
Merge [only] [(SequenceNoField [, SequenceNoVar])] On ListOfKeys [Concatenate [(TableName)]] (loadstatement | selectstatement)
During a normal (non-partial) reload, the Merge LOAD construction will work as a normal Load statement but with the additional functionality of removing older obsolete records and records marked for deletion. The first field of the load statement must hold information about the operation: Insert, Update, or Delete.
For each loaded record, the record identifier will be compared with previously loaded records, and only the latest record (according to the sequence number) will be kept. If the latest record is marked with Delete, none will be kept.
If the Concatenate prefix is used, or if there already exists a table with the same set of fields, this will be the relevant table to modify. Otherwise, the result of the Merge LOAD construction will be stored in a new table.
If the Concatenate prefix is used, the resulting table may have more fields than the change log that is used as input to the Merge.
A partial reload will do the same as a full reload. The only difference is that the Merge LOAD construction will never create a new table. There always exists a relevant table from the previous script execution to modify.
If the loaded change log is an accumulated log, that is it contains changes that already have been loaded, the parameter SequenceNoVar can be used in a Where clause to limit the amount of input data. The Merge LOAD could then be made to only load records where the field SequenceNoField is greater than SequenceNoVar. Upon completion, the Merge LOAD will assign a new value to the SequenceNoVar with the maximum value seen in the SequenceNoField field.
|only||An optional qualifier denoting that the statement should be executed only during partial reloads. It should be disregarded during normal (non-partial) reloads.|
|SequenceNoField||The name of the field containing a timestamp or a sequence number that defines the order of the operations.|
The name of the variable that gets assigned the maximum value for SequenceNoField of the table being merged.
|ListOfKeys||A comma separated list of field names specifying the primary key.|
|operation||The first field of the load statement must contain the operation: Insert, Update, or Delete. ‘i’, ‘u’ and ‘d’ are also accepted.|
Example: Data load script
In this example, an inline table named Persons is loaded with four rows. Merge then changes the table as follows:
- Adds the row, Mary, 4
- Deletes the row, Steven, 3
- Assigns the number 5 to Jake
The LastChangeDate variable will be set to the maximum value in the ChangeDate column after Merge is executed.
Prior to the Merge load, the resulting table appears as follows:
Following the Merge load, the table appears as follows:
When the data is loaded, the Data load progress dialog box shows the operations that are performed:
Example: Data load script - Partial reload
In the following example, the Only argument specifies that the Merge command will only be executed during a partial reload. Updates will be filtered based on the previously captured LastChangeDate. After Merge is finished, LastChangeDate variable will be assigned the maximum value of the ChangeDate column processed during the merge