Introducing Qlik Talend Data Integration
With Qlik Talend Data Integration, you can deliver data ready for consumption to Qlik Cloud or to cloud data warehouses such as Snowflake, Google Cloud BigQuery, and Azure Synapse Analytics. Data sources can be on-premises or in the cloud. The data can be kept up-to-date without manual intervention using CDC (Change Data Capture) or batch technologies, such as scheduled reloads. You can create a data pipeline, perform fit-for-purpose transformations, and create data marts.
You can access the Qlik Talend Data Integration home by selecting Data Integration from the launcher menu.
For more information about the architecture of Qlik Talend Data Integration, see Dataset architecture in a cloud data warehouse.
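The difference between the two update approaches named above, CDC and scheduled full reloads, can be illustrated with a minimal, generic sketch. It does not show how Qlik Talend Data Integration works internally: the function names, the change-event format ("op" and "row"), and the sample rows are all assumptions made for illustration.

```python
from typing import Any

Row = dict[str, Any]


def full_reload(source_rows: list[Row], key: str) -> dict[Any, Row]:
    """Batch approach: rebuild the whole target from a fresh copy of the source."""
    return {row[key]: dict(row) for row in source_rows}


def apply_changes(target: dict[Any, Row], changes: list[dict[str, Any]], key: str) -> None:
    """CDC approach: apply only the captured change events, in order."""
    for change in changes:
        op, row = change["op"], change["row"]
        if op in ("insert", "update"):
            target[row[key]] = dict(row)
        elif op == "delete":
            target.pop(row[key], None)
        else:
            raise ValueError(f"unknown operation: {op}")


# Hypothetical sample data: an initial full load, then a small stream of changes.
source = [{"id": 1, "name": "Ada"}, {"id": 2, "name": "Bo"}]
target = full_reload(source, key="id")

changes = [
    {"op": "update", "row": {"id": 2, "name": "Bob"}},
    {"op": "insert", "row": {"id": 3, "name": "Cy"}},
    {"op": "delete", "row": {"id": 1}},
]
apply_changes(target, changes, key="id")
print(target)
```

In this sketch a full reload touches every source row each time it runs, while the CDC path only processes the three change events.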
Subscription options
Qlik Talend Cloud subscriptions are based on a capacity model with the volume of Data Moved as the primary value meter.
Qlik Talend Cloud is available in four subscription tiers: Starter, Standard, Premium, and Enterprise. The higher tiers provide more advanced data sources and transformations, including capabilities hosted on Qlik Cloud and Talend Cloud. All subscriptions include Qlik Cloud Analytics Standard.
For more information about subscription options, see Qlik Talend Cloud subscription options.
Data spaces
Data spaces are governed areas of your Qlik Cloud tenant that are used to create and store projects. Inside the space, you can also create new connections using connectors, and manage access to Data Movement gateways. All data assets will be created in the space of the project that they belong to.
For more information, see Working in spaces in Qlik Talend Data Integration.
Projects
A project is where you create your data integration flow, using data tasks. The project is associated with a data platform that is used as the target for all output. You can create a project for either of the following use cases:
-
Data pipeline
Create a simple linear pipeline, or a complex pipeline consuming several data sources and generating many outputs.
-
Replication
Replicate data from supported data sources to any supported target, or land data to a data lake.
Data task
A data task is the main unit of work in a project. To create a new data task, click Add new in the top bar, and then click the appropriate task. You can create data tasks of the following types in a project.
Data tasks in data pipeline projects
-
Landing
Copy data from a data source to a landing area. Data sources can be on-premises or in the cloud. The landing area can be a cloud target, or an Amazon S3 data bucket (only when creating QVD datasets).
You can keep data up-to-date without manual intervention by using CDC, or by performing full loads that are scheduled to reload periodically.
-
Registered data
Register data that already exists on the data platform. This lets you use data that was onboarded with tools other than Qlik Talend Data Integration, for example, Qlik Replicate.
-
Storage
Create ready-to-consume datasets in a cloud data warehouse, or in Qlik Cloud, from the data copied by the landing data task. The datasets can be kept up-to-date with the landing data without manual intervention.
-
Transform
Create reusable data transformations based on rules and custom SQL as a part of your data pipeline. You can perform row-level transformations and create datasets that are either materialized as tables, or created as views that perform transformations on the fly.
-
Data mart
Create data marts to leverage your Storage data tasks or Transform data tasks. You can create any number of data marts depending on your business needs. Ideally, your data marts should contain repositories of summarized data collected for analysis on a specific section or unit within an organization.
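The Transform task above distinguishes datasets materialized as tables from views that transform on the fly. The following generic sketch illustrates that difference using SQLite from the Python standard library as a stand-in for a data platform. It is not the SQL that Qlik Talend Data Integration generates, and the table, view, and column names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL, currency TEXT);
    INSERT INTO orders VALUES (1, 10.0, 'usd'), (2, 25.5, 'eur');

    -- Materialized: the transformed rows are computed once and stored.
    CREATE TABLE orders_clean_tbl AS
        SELECT id, amount, UPPER(currency) AS currency FROM orders;

    -- View: the row-level transformation runs each time the view is queried.
    CREATE VIEW orders_clean_vw AS
        SELECT id, amount, UPPER(currency) AS currency FROM orders;
""")

# A new source row arrives after both datasets were created.
conn.execute("INSERT INTO orders VALUES (3, 7.0, 'gbp')")

table_count = conn.execute("SELECT COUNT(*) FROM orders_clean_tbl").fetchone()[0]
view_count = conn.execute("SELECT COUNT(*) FROM orders_clean_vw").fetchone()[0]
print(table_count, view_count)
```

In this sketch the stored table still holds two rows until it is rebuilt, while the view already reflects the third row, at the cost of running the transformation on every query.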
Data tasks in replication projects
-
Replication
Replicate data from supported data sources to any supported target.
-
Lake landing
Land data to a data lake.
For more information, see Landing data in a data lake with a Standard, Premium, or Enterprise subscription.
Monitoring your data tasks
You can monitor the status and progress of your data tasks with monitor views. A monitor view lets you view the status of all data tasks in the tenant, or a subset of data tasks based on a filter. You can create several views to monitor different aspects of your data pipelines. For more information, see Monitoring and operating your data tasks.
Data products
Datasets that have been registered from a data project or a manual upload, and added to your Catalog, can be grouped and packaged as a data product. For example, you can group datasets by business domain and make them available on the data marketplace for analytics consumers to use in apps. For more information, see Working with data products.
Connections
Connections are used to let data tasks access data sources, external storage and target platforms for data delivery and push-down transformations.
Managing your connections
Click Connections on the left to view all your connections.
-
You can edit connections that you own.
Information note: You can also edit all connections in a data space where you are the owner, or have a Can manage role.
Click ... and then Edit.
-
You can test a connection.
Click ... and then Test connection.
-
You can delete a connection.
Click ... and then Delete.
Creating a connection
There are several ways of creating connections:
-
Click Connections on the left, and then Create connection.
-
Click Create new in data task setup wizards where you select a connection.
-
Click Create connection in Connections view.
You can filter connectors by:
-
Category
Data warehouse, Cloud storage, Database and Application.
-
Type
Source or Target.
You can also select from recently used connectors.
You need to select the type of data source, and then enter the address and authentication information.