Step 4: Create a Qlik Open Lakehouse project
Create a Qlik Open Lakehouse pipeline project to ingest data from any source. Store your data in Iceberg open table format.
Prerequisites
To create a Qlik Open Lakehouse project, you need:
- A network integration to enable Qlik to provision and manage compute resources on your behalf.
- A lakehouse cluster configured to run the data storage task within your Iceberg project.
- A connection to a data catalog to use as the data target for your project, or the necessary details so you can create a new connection.
Supported tasks
The following tasks are supported in a Qlik Open Lakehouse project:
- Lake landing data task: Lands data in CSV format in S3, from any Qlik-supported source, including high-volume data streams.
- Storage data task: Consumes the data landed in the cloud by the Lake landing task and writes it into Iceberg tables for efficient storage and querying.
- Mirror data task: Mirrors Iceberg tables from your Qlik Open Lakehouse to Redshift or Snowflake. Users can query the data via external tables without migrating it to your cloud data warehouse.
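Conceptually, the Lake landing and Storage tasks form a two-stage handoff: land raw rows as CSV files, then consume those files into tables. The following Python sketch is purely illustrative and uses only local files — a temporary directory stands in for the S3 landing bucket, and a JSON-lines file stands in for an Iceberg table. The function names and record layout are assumptions for this example, not Qlik APIs.

```python
import csv
import json
import tempfile
from pathlib import Path

# Illustrative stand-ins: local directories play the roles of the S3
# landing bucket and the Iceberg storage layer.
landing_zone = Path(tempfile.mkdtemp()) / "landing"
storage_zone = Path(tempfile.mkdtemp()) / "storage"
landing_zone.mkdir(parents=True)
storage_zone.mkdir(parents=True)

def land_data(rows, name):
    """Lake landing step: write source rows as a CSV file in the landing zone."""
    path = landing_zone / f"{name}.csv"
    with path.open("w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=list(rows[0]))
        writer.writeheader()
        writer.writerows(rows)
    return path

def store_landed_file(path):
    """Storage step: consume a landed CSV and append its records to the
    'table' (a JSON-lines file standing in for an Iceberg table)."""
    table = storage_zone / f"{path.stem}.jsonl"
    with path.open(newline="") as src, table.open("a") as dst:
        for record in csv.DictReader(src):
            dst.write(json.dumps(record) + "\n")
    return table

landed = land_data([{"id": "1", "name": "Ada"}, {"id": "2", "name": "Grace"}], "customers")
table = store_landed_file(landed)
print(sum(1 for _ in table.open()))  # 2 records stored
```

In the real pipeline, both stages run as managed tasks on your lakehouse cluster; the sketch only mirrors the shape of the data flow.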
Example of creating a Qlik Open Lakehouse project
The following example creates a Qlik Open Lakehouse pipeline project, onboards data, and stores it in Iceberg format tables. This example creates a simple pipeline that you could expand by onboarding more data sources. You could add a Mirror data task to mirror your tables in Redshift or Snowflake without duplicating data, or use this project as the source for a project that requires transformations in your cloud data warehouse.
To create a Qlik Open Lakehouse project, do the following:
1. In Data Integration home, click Create pipeline, and configure it:
   - Name: Enter the name for the project.
   - Space: Select the space the project will belong to.
   - Description: Optionally, enter a description for the project.
2. For Use case, select Data pipeline.
3. Configure the Data platform:
   - Data platform: Select Qlik Open Lakehouse from the list.
   - Data catalog connection: In the list, select an existing connection or click Create new to add a new data catalog connection.
   - Landing target connection: Select the S3 bucket for landing the data or click Create new to add a new bucket location.
   - Storage compute cluster: Select the lakehouse cluster that will run the storage task.
4. Create the project.
5. Follow the steps in the onboarding data wizard. For more information, see Onboarding data.
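Once the pipeline runs, the Lake landing task drops CSV files for the Storage task to consume. As a loose illustration of what "landed in CSV format" means at the file level, the hypothetical helper below checks that a landed payload has a header row and consistent column counts. It is a local sketch for intuition only, not part of Qlik's tooling.

```python
import csv
import io

def validate_landed_csv(text):
    """Check that a landed CSV payload has a header row and that every
    data row has the same number of columns as the header."""
    reader = csv.reader(io.StringIO(text))
    try:
        header = next(reader)
    except StopIteration:
        return False, "empty file"
    for i, row in enumerate(reader, start=2):
        if len(row) != len(header):
            return False, f"row {i} has {len(row)} columns, expected {len(header)}"
    return True, "ok"

ok, msg = validate_landed_csv("id,name\n1,Ada\n2,Grace\n")
print(ok, msg)    # True ok
bad, why = validate_landed_csv("id,name\n1,Ada,extra\n")
print(bad, why)   # False row 2 has 3 columns, expected 2
```

In practice the managed Storage task handles schema enforcement itself; a check like this would only matter if you inspected the landing bucket directly.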