Storing streaming datasets
The following Streaming Transform task settings apply to Qlik Open Lakehouse projects that use a streaming source.
You can store and transform streaming data using the Streaming Transform data task. Streaming data often contains nested structures and arrays that need flattening, so transformation capabilities are required during the storage phase. The Streaming Transform task provides these capabilities, letting you apply transformations immediately after your streaming data lands.
Storage settings
You can set properties for the Streaming Transform data task when the data platform is Qlik Open Lakehouse.
-
Click Settings.
General settings
-
Task schema
You can change the name of the Streaming Transform task schema. The default name is the name of the storage task.
-
Internal schema
You can change the name of the internal storage data asset schema. The default name is the name of the storage task with _internal appended.
- Prefix for all tables and views
You can set a prefix for all tables and views created with this task.
Information note: You must use a unique prefix when you want to use a database schema in several data tasks.
-
Folder to use
You can change the Streaming Transform task storage folder.
-
Load settings for new datasets
-
Append only
Adds new records without modifying existing data. Key constraints are not enforced, so duplicate records are kept if they arrive.
-
Apply changes (Merge)
Updates existing records and inserts new records based on key fields.
If you choose to merge changes, you can also select the following options:
-
Soft delete records by providing deletion expression
Define a deletion expression to mark records for deletion.
-
Keep historical records (Type 2)
Keep previous versions of changed records.
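The difference between the load modes above can be illustrated with a minimal Python sketch. This is not how the Streaming Transform task is implemented. The function names, the `_current` and `_deleted` columns, and the `is_deleted` callback are hypothetical and exist only to show the semantics of appending, merging on key fields, soft deletes, and Type 2 history.

```python
def append_only(table, records):
    """Append only: add every incoming record and never touch existing rows."""
    return list(table) + list(records)


def merge(table, records, key, is_deleted=None, keep_history=False):
    """Apply changes (Merge): update rows with a matching key, insert the rest."""
    # Rows without the hypothetical "_current" flag are treated as current.
    current = {row[key]: row for row in table if row.get("_current", True)}
    history = [row for row in table if not row.get("_current", True)]
    for rec in records:
        old = current.get(rec[key])
        if old is not None and keep_history:
            # Type 2: retain the previous version instead of overwriting it.
            history.append({**old, "_current": False})
        new = {**(old or {}), **rec, "_current": True}
        if is_deleted is not None:
            # Soft delete: flag the row rather than removing it.
            new["_deleted"] = bool(is_deleted(rec))
        current[rec[key]] = new
    return history + list(current.values())
```

With two incoming records that share the same key, `append_only` keeps both, `merge` keeps only the latest version, and `merge(..., keep_history=True)` keeps the latest version plus the superseded one marked as not current.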
-
Column unnesting
-
Preserve nested columns
Select to preserve nested data.
-
Unnest into separate columns
The default behavior is to unnest into separate columns.
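As an illustration of what unnesting into separate columns means, the following Python sketch flattens a nested record into top-level columns. The underscore-joined column names are an assumption made for this example, and the naming that the task actually produces may differ. Arrays are left untouched here.

```python
def unnest(record, parent="", sep="_"):
    """Flatten nested structures into separate top-level columns."""
    flat = {}
    for name, value in record.items():
        # Hypothetical naming scheme: parent and child joined by an underscore.
        column = f"{parent}{sep}{name}" if parent else name
        if isinstance(value, dict):
            # Recurse into the nested structure and lift its fields up.
            flat.update(unnest(value, column, sep))
        else:
            flat[column] = value
    return flat


event = {"id": 1, "user": {"name": "a", "geo": {"lat": 1.5}}}
columns = unnest(event)  # keys: id, user_name, user_geo_lat
```

Preserving nested columns corresponds to storing `event` as it is, with `user` kept as a single structured column.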
-
Target tables partition
-
No partition
New tables are created without partitions.
-
Partition by event date
New tables are partitioned by the date on which events are ingested.
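Date partitioning can be pictured as grouping events under one key per calendar day. The sketch below assumes each event carries a hypothetical `ingested_at` field holding epoch seconds in UTC. The real task derives the partition value itself, so this only illustrates the idea.

```python
from collections import defaultdict
from datetime import datetime, timezone


def partition_by_date(events, ts_field="ingested_at"):
    """Group events into one partition per UTC calendar day."""
    partitions = defaultdict(list)
    for event in events:
        # ts_field is a hypothetical column with epoch seconds.
        stamp = datetime.fromtimestamp(event[ts_field], tz=timezone.utc)
        partitions[stamp.date().isoformat()].append(event)
    return dict(partitions)
```

With no partitioning, all events would end up in a single group regardless of their timestamp.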
Runtime settings
-
Lakehouse cluster
You can change the lakehouse cluster, but the cluster you select must support streaming workloads or mixed workloads.
Schema evolution settings
-
Add columns on root level
This setting applies when new columns are added at the root level in the Streaming landing task.
-
Apply to target
Automatically adds new root level columns from the Streaming landing task to the Streaming Transform task. This is the default setting.
-
Ignore
Does not add new root level columns.
-
Stop task
Stops the transform task if a new root level column is detected in the Streaming landing task.
-
Add columns to structures
This setting applies when new fields are added inside an existing nested structure in the Streaming landing task.
- Apply to target
Automatically adds new fields to existing structures in the Streaming Transform task when they are added to the corresponding structure in the Streaming landing task.
-
Ignore
Does not add new fields to existing structures.
-
Stop task
Stops the transform task if a new field is added to a structure in the Streaming landing task.
-
Change field data type
- Ignore
Does not change the data type.
-
Stop task
Stops the transform task if a data type change is detected in the Streaming landing task.
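The Apply to target, Ignore, and Stop task options for added columns can be sketched as a small policy function. This is a conceptual sketch only. The `evolve` function, the policy strings, and the `SchemaChangeStopped` exception are hypothetical names and not part of any Qlik API.

```python
class SchemaChangeStopped(Exception):
    """Raised when the policy is to stop on a detected schema change."""


def evolve(target_columns, landing_columns, policy="apply"):
    """Return the target column list after applying a schema evolution policy."""
    new = [c for c in landing_columns if c not in target_columns]
    if not new or policy == "ignore":
        # Ignore: the target keeps its current columns.
        return list(target_columns)
    if policy == "apply":
        # Apply to target: new landing columns are added to the target.
        return list(target_columns) + new
    if policy == "stop":
        # Stop task: surface the change instead of continuing.
        raise SchemaChangeStopped(f"new columns detected: {new}")
    raise ValueError(f"unknown policy: {policy}")
```

For example, `evolve(["id"], ["id", "email"])` returns both columns, while the same call with `policy="ignore"` returns only `["id"]`.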