Incremental vs. snapshot base type in distribution tables
Qlik Catalog enables users to manage snapshot and incremental loads through partition administration in Hive or whichever distribution engine is in use. When an entity is created, the entity property entity.base.type is set to designate how data loads into the entity: snapshot (the default), where every data set loaded is a full "refresh" of all data for that entity, or incremental, where after the first load every subsequent data set is added on top of the data already loaded.
This setting drives the current view of the data exposed through the distribution table. For snapshot, the current view is the partition of the most recently loaded data; for incremental, the current view is the aggregate of all data partitions.
To see which base type is configured for an entity, go to the entity grid in the Source module and either select (view details) or select View/Entity Properties from the More dropdown on the entity row. This displays General Information for the entity. Entity Base Type provides radio buttons for Snapshot and Incremental. These buttons show the current setting, which administrators can modify.
Querying an entity can return significantly different record counts depending on the entity base type setting, as the following two-load example shows.
Entity Base Type | Partitions | SELECT * FROM SOURCE.ENTITY returns: | SELECT * FROM SOURCE.ENTITY_HISTORY returns: |
---|---|---|---|
Incremental | Load A: 50 records (Partition 1); Load B: 10 records (Partition 2) | Partitions 1 + 2 = 60 records | Partitions 1 + 2 = 60 records |
Snapshot | Load A: 50 records (Partition 1); Load B: 60 records (Partition 2) | Partition 2 = 60 records | Partitions 1 + 2 = 110 records |
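
The record counts in the table can be reproduced with a small model of partitions. The following Python sketch is illustrative only and is not Qlik Catalog code: the function names `current_view`, `history_view`, and `make_load` are hypothetical, and each load is assumed to create exactly one partition. It treats the current view (SOURCE.ENTITY) as the latest partition for snapshot and as the union of all partitions for incremental, while the history view (SOURCE.ENTITY_HISTORY) always spans every partition.

```python
from typing import List

Partition = List[str]  # one partition = the rows written by a single load


def current_view(partitions: List[Partition], base_type: str) -> List[str]:
    """Rows a query on SOURCE.ENTITY would see in this simplified model."""
    if not partitions:
        return []
    if base_type == "snapshot":
        # Snapshot: only the most recently loaded partition is current.
        return list(partitions[-1])
    if base_type == "incremental":
        # Incremental: every partition contributes to the current view.
        return [row for part in partitions for row in part]
    raise ValueError(f"unknown base type: {base_type!r}")


def history_view(partitions: List[Partition]) -> List[str]:
    """Rows a query on SOURCE.ENTITY_HISTORY would see: all partitions."""
    return [row for part in partitions for row in part]


def make_load(name: str, count: int) -> Partition:
    """Hypothetical helper: build `count` placeholder rows for one load."""
    return [f"{name}-{i}" for i in range(count)]


# Same loads as in the table: A then B, one partition per load.
incremental_parts = [make_load("A", 50), make_load("B", 10)]
snapshot_parts = [make_load("A", 50), make_load("B", 60)]

# (current view count, history view count) per base type
counts = {
    "incremental": (
        len(current_view(incremental_parts, "incremental")),
        len(history_view(incremental_parts)),
    ),
    "snapshot": (
        len(current_view(snapshot_parts, "snapshot")),
        len(history_view(snapshot_parts)),
    ),
}
print(counts)  # {'incremental': (60, 60), 'snapshot': (60, 110)}
```

In this model the two base types happen to return the same current-view count (60) but for different reasons: incremental sums both partitions, whereas snapshot exposes only the second load. The history counts differ because a snapshot entity retains the superseded first partition.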