Landing streaming data to Qlik Open Lakehouse

Data lands in Amazon S3, ready for the Streaming Transform task to convert it into the Iceberg open table format. You can land data from any streaming source supported by Qlik.

Landing streaming data to a Qlik Open Lakehouse requires a pre-configured Amazon S3 bucket. Qlik Open Lakehouse is optimized for high-volume data sources and is compatible with all Qlik-supported streaming data sources. For more information on supported streaming sources, see Connecting to data streams.

Raw data lands in Avro format in S3 and the Streaming Transform task converts the data to Iceberg format. The Iceberg specification enables data to be queried from any engine that natively supports Trino SQL, for example Amazon Athena, Ahana, or Starburst Enterprise. Optionally, tables can be mirrored to your cloud data warehouse where they can be queried without duplicating data.
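As an illustration of querying the resulting Iceberg tables from a Trino-compatible engine, the following is a minimal sketch that builds a preview query string. The database and table names (lakehouse_db, orders_stream) and both helper functions are hypothetical and not part of Qlik; the sketch only assumes that the engine accepts standard Trino-style double-quoted identifiers.

```python
def quote_identifier(name: str) -> str:
    """Quote an identifier Trino-style, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def build_preview_query(database: str, table: str, limit: int = 10) -> str:
    """Build a SELECT that previews rows from a catalog table.

    The database and table are assumed to be names registered in the
    data catalog; they are placeholders in this sketch.
    """
    if limit < 1:
        raise ValueError("limit must be a positive integer")
    return (
        f"SELECT * FROM {quote_identifier(database)}."
        f"{quote_identifier(table)} LIMIT {limit}"
    )


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    print(build_preview_query("lakehouse_db", "orders_stream"))
```

The resulting string can be pasted into the query editor of your engine or submitted through that engine's own client. Submitting the query is intentionally left out, because it depends on the engine and credentials you use.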

Landing data to a Qlik Open Lakehouse is available in projects with an AWS Glue Data Catalog target connection.

Preparations

  • A storage lakehouse cluster is required to run the ingestion and must be configured prior to creating your project.

  • You can configure your source and target connection settings in the setup wizard, but to simplify setup, it is recommended to configure them before you create the task.

  • To mirror data to your cloud data warehouse, you must first create a Qlik Open Lakehouse project to ingest your data and store it using the Iceberg open table format. You can add a Mirror data task after the Streaming Transform task. For more information, see Mirroring data to a cloud data warehouse.

Creating a Streaming landing task

To create a Streaming landing task, first create the project:

  1. Create a project, and select Data pipeline in Use case.

  2. Select Qlik Open Lakehouse in Data platform and establish a connection to the data catalog.

  3. Set up a storage area in Landing target connection.

  4. Select the Storage lakehouse cluster for performing the ingestion and optimization of the data.

  5. Click Create to create the project.

When you onboard data or create a landing task in the project, a Streaming landing task is created instead of a Landing task. Streaming landing tasks operate and behave similarly to Landing tasks, except that they land data to cloud storage from streaming sources. For more information, see Connecting to data streams.

All files are landed in Avro format. After the landing data is updated, the Streaming Transform task consumes the landing data and updates the external tables.
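If you want to sanity-check a landed file outside of Qlik, one option is to inspect its header. The sketch below assumes that the landed files are Avro object container files, which begin with the four-byte magic sequence "Obj" followed by the byte 0x01 as defined in the Avro specification. The function name and the sample bytes are hypothetical, and how you download the file from S3 is left to your own tooling.

```python
# Magic bytes that open every Avro object container file (per the Avro spec).
AVRO_MAGIC = b"Obj\x01"


def looks_like_avro_container(data: bytes) -> bool:
    """Return True if the given bytes start with the Avro container magic."""
    return data[:4] == AVRO_MAGIC


if __name__ == "__main__":
    # Illustrative byte strings, not real landed data.
    print(looks_like_avro_container(b"Obj\x01rest-of-file"))
    print(looks_like_avro_container(b"PAR1rest-of-file"))
```

This check only confirms the file type. It does not validate the schema or the records inside the file.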

Settings

For more information about task settings, see Streaming lake landing settings.
