Validating and correcting your data with data stewardship

With Data stewardship in Qlik Talend Cloud, you can draw on subject matter experts to validate and correct your data. Use your existing semantic types and validation rules to make sure that the data is consistently formed. This extends automated pipelines with human-in-the-loop remediation that draws on domain expertise. When the data is validated, you can re-inject it into the original data source or send it to any downstream system.

Available in Qlik Talend Cloud Enterprise.
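
As a rough illustration of how validation rules can route records to human review, the following Python sketch checks each record against a set of rules and separates the records that pass from those a data steward would need to resolve. The rules, the field names, and the `split_for_review` helper are hypothetical and are not part of Qlik Talend Cloud; the sketch only mirrors the idea, not how the product implements it.

```python
import re
from typing import Callable, Dict, List, Tuple

Record = Dict[str, str]
Rule = Callable[[str], bool]

# Hypothetical rules, keyed by field name. Real rules would come from
# your own semantic types and validation rules.
RULES: Dict[str, Rule] = {
    "email": lambda v: re.fullmatch(r"[^@\s]+@[^@\s]+\.[^@\s]+", v) is not None,
    "country_code": lambda v: re.fullmatch(r"[A-Z]{2}", v) is not None,
}


def invalid_fields(record: Record, rules: Dict[str, Rule]) -> List[str]:
    """Return the names of the fields that fail their rule."""
    return [name for name, rule in rules.items() if not rule(record.get(name, ""))]


def split_for_review(
    records: List[Record], rules: Dict[str, Rule]
) -> Tuple[List[Record], List[Tuple[Record, List[str]]]]:
    """Separate valid records from records that need a steward's attention."""
    valid: List[Record] = []
    needs_review: List[Tuple[Record, List[str]]] = []
    for record in records:
        failed = invalid_fields(record, rules)
        if failed:
            needs_review.append((record, failed))
        else:
            valid.append(record)
    return valid, needs_review


# Made-up sample data for illustration only
records = [
    {"id": "1", "email": "ana@example.com", "country_code": "SE"},
    {"id": "2", "email": "not-an-email", "country_code": "se"},
]
valid, needs_review = split_for_review(records, RULES)
```

Here the second record would be held back for review along with the list of fields that failed, which is the information a steward needs to correct it.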

You create a sprint, which is the main body of work for validation and remediation. The sprint contains information about:

- The source data
- The data schema to use for validation
- The owners of the sprint
- The data stewards that are defined
- The data storage used for sprint data
- Workflow settings

During the sprint, all sprint data is stored in your own cloud data warehouse, and not in Qlik Talend Cloud. Currently, Snowflake is the only supported cloud data warehouse.

You can define the following user roles:

Sprint owner

Sprint owners can validate records that are resolved by data stewards. They can also access resolved records and export data.

Data steward

Data stewards are assigned records and resolve the quality issues in them.

You create sprints in Data stewardship in the Qlik Talend Data Integration activity center. You can create Resolution sprints that correct and curate data in one or more fields in the dataset that requires validation. This is the workflow:

Creating a resolution sprint

Create a sprint and define the data to validate. You can either populate the sprint with a Talend Studio Job, or import a CSV file with data.

Define the data stewards who perform the validation. Records can be assigned either manually or automatically.

Working in a resolution sprint

Data stewards validate the data in the assigned records.

Managing resolved records

- If you populated the sprint with a Talend Studio Job, create a Talend Studio Job to retrieve the validated records and return them to the original data source or to any other required destination.
- If you populated the sprint with a CSV file, conclude the sprint by exporting the validated data to a CSV file. You can then update the data source with the validated data by importing the exported CSV file.
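
For the CSV path, updating the source with validated data amounts to merging the exported file back into the source by a key. The following Python sketch shows one way to do that with the standard `csv` module. It assumes, purely for illustration, that both files share an `id` column and the same headers; the actual layout of the exported CSV file is not described here, so adjust the key and columns to match your export. The `apply_validated` helper is hypothetical.

```python
import csv
import io
from typing import Dict, List, TextIO


def apply_validated(
    source: TextIO, validated: TextIO, key: str = "id"
) -> List[Dict[str, str]]:
    """Return the source rows, replaced by the validated row when keys match."""
    corrections = {row[key]: row for row in csv.DictReader(validated)}
    return [corrections.get(row[key], row) for row in csv.DictReader(source)]


# In-memory stand-ins for the original source file and the exported file
source_csv = io.StringIO("id,email\n1,ana@example.com\n2,not-an-email\n")
validated_csv = io.StringIO("id,email\n2,bo@example.com\n")

merged = apply_validated(source_csv, validated_csv)

# Write the merged rows back out as CSV
out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=["id", "email"], lineterminator="\n")
writer.writeheader()
writer.writerows(merged)
```

Rows without a validated counterpart are kept unchanged, so the merge is safe to run even when only some of the records went through the sprint.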
