Creating a resolution sprint from a CSV file
You can create a resolution sprint from a CSV file containing the data to validate.
Prerequisites
Before you create the sprint, you need:
-
A space to use when creating the sprint.
Sprint owners/creators must have the following permissions in the space: Can manage, Can edit, Can view, Can view data
Data stewards must have the following permissions in the space: Can edit, Can view, Can view data
-
A connection to the Snowflake data warehouse that you want to use to store sprint data. Do not use a data gateway for the connection.
All sprint users must have the following permissions in the space of the connection: Can edit, Can view, Can view data
You can create a connection in Connections in the Qlik Talend Data Integration activity center.
For more information about Snowflake connections, see Snowflake.
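The permission requirements above can be summarized as a simple lookup. The following Python sketch is illustrative only: the role keys, the data layout, and the missing_permissions helper are assumptions made for this example and are not part of any Qlik API. Only the permission names come from the prerequisites listed above.

```python
# Required space permissions per role, as listed in the prerequisites.
# The dictionary keys and the helper name are hypothetical.
REQUIRED_PERMISSIONS = {
    "sprint_owner": {"Can manage", "Can edit", "Can view", "Can view data"},
    "data_steward": {"Can edit", "Can view", "Can view data"},
    "connection_space_user": {"Can edit", "Can view", "Can view data"},
}


def missing_permissions(role, granted):
    """Return the sorted list of permissions the role still needs."""
    return sorted(REQUIRED_PERMISSIONS[role] - set(granted))


print(missing_permissions("data_steward", ["Can view", "Can view data"]))
# ['Can edit']
```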
Creating a sprint
To create a resolution sprint, click Create sprint in Data stewardship in the Qlik Talend Data Integration activity center.
General sprint settings
-
Name
Add a name for the sprint.
-
Space
Select which space to create the sprint in.
-
Description
Add a description of the sprint.
-
Source for sprint population
Select File.
Import the CSV file containing the data that you want to validate.
Click Next when you are ready to proceed to define the data schema.
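Before importing, it can help to confirm that the CSV file has a header row and that every row has the same number of fields. The sketch below uses Python's standard csv module. The check_csv function and the sample data are hypothetical, and this check is a suggestion rather than a documented requirement of the import.

```python
import csv
import io


def check_csv(text):
    """Return (header, problems) for CSV content given as a string.

    problems is a list of (line_number, field_count) for rows whose
    field count differs from the header's.
    """
    reader = csv.reader(io.StringIO(text))
    header = next(reader, None)
    if header is None:
        raise ValueError("CSV content is empty")
    problems = []
    for row in reader:
        if len(row) != len(header):
            problems.append((reader.line_num, len(row)))
    return header, problems


sample = "id,name,email\n1,Ada,ada@example.com\n2,Bob\n"
header, problems = check_csv(sample)
print(header)    # ['id', 'name', 'email']
print(problems)  # [(3, 2)]
```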
Define the data schema
You can now review the data schema that will be used to validate data, and adapt it to your requirements. Data quality indicators are displayed for each column, and potentially invalid data is highlighted. The indicators are based on a sample of the data.
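As an illustration of what sample-based quality indicators can look like, the following Python sketch counts valid, invalid, and empty values in a column sample. How the product actually computes its indicators is not described here, so the classification rule, the function names, and the sample values are all assumptions.

```python
def quality_counts(values, is_valid):
    """Count valid, invalid, and empty values in a column sample."""
    counts = {"valid": 0, "invalid": 0, "empty": 0}
    for value in values:
        if value is None or value.strip() == "":
            counts["empty"] += 1
        elif is_valid(value):
            counts["valid"] += 1
        else:
            counts["invalid"] += 1
    return counts


def looks_like_integer(value):
    """Hypothetical validity check for a column of whole numbers."""
    text = value.strip()
    if text[:1] in ("+", "-"):
        text = text[1:]
    return text.isascii() and text.isdigit()


sample = ["42", "", "17", "n/a", " 8 "]
print(quality_counts(sample, looks_like_integer))
# {'valid': 3, 'invalid': 1, 'empty': 1}
```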
Lock columns
Click ... on a column and select Lock to lock the column for editing in the sprint. The column data will still be visible but cannot be edited by data stewards.
Exclude columns
Click ... on a column and select Exclude to exclude the column from the sprint. The column data will not be visible to data stewards.
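The effect of locking and excluding columns can be modeled as two flags per column. This is a minimal sketch under that assumption. The ColumnSetting class, the steward_view function, and the column names are hypothetical and only mirror the behavior described above.

```python
from dataclasses import dataclass


@dataclass
class ColumnSetting:
    name: str
    locked: bool = False    # visible to stewards but read-only
    excluded: bool = False  # not visible to stewards at all


def steward_view(columns):
    """Return (visible, editable) column names for data stewards."""
    visible = [c.name for c in columns if not c.excluded]
    editable = [c.name for c in columns if not c.excluded and not c.locked]
    return visible, editable


columns = [
    ColumnSetting("customer_id", locked=True),
    ColumnSetting("email"),
    ColumnSetting("internal_notes", excluded=True),
]
print(steward_view(columns))
# (['customer_id', 'email'], ['email'])
```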
Apply a semantic type to a column
By default, a column uses its native data type. You can apply a semantic type to the column to assist stewards when validating data.
-
Select the column and click the icon next to Data type. You can now select a semantic type to apply to the column.
You can also change the name and the description for each column.
Add a validation rule to a column
You can apply validation rules to a column to make it easier to spot invalid data. Invalid data will be highlighted in the column.
-
Select the column and click Apply validation rule. You can either select an existing validation rule or create a new validation rule.
For more information about creating validation rules, see Creating a validation rule.
Click Next when you are ready to proceed to define the data storage.
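Conceptually, a validation rule is a test applied to every value in a column, with failing values flagged. The sketch below shows the idea with a regular expression in Python. It does not use the product's rule syntax, and the postal code pattern, the invalid_rows function, and the sample values are hypothetical.

```python
import re

# Hypothetical rule: a value is valid if it is a five-digit postal code.
POSTAL_CODE = re.compile(r"[0-9]{5}")


def invalid_rows(values, pattern):
    """Return the indexes of values that do not fully match the pattern."""
    return [i for i, v in enumerate(values) if not pattern.fullmatch(v)]


column = ["75001", "7500", "10115", "ABCDE"]
print(invalid_rows(column, POSTAL_CODE))  # [1, 3]
```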
Connect to data storage
You must connect to the cloud data warehouse that you want to use to store sprint data. Snowflake is currently the only supported data warehouse.
-
Select the connection to the data warehouse.
-
Select which database to use.
-
Select whether to use an existing database schema or a new database schema.
If you select New database schema, set the name of the new schema.
-
Set the name of the table to use for resolved sprint data in Table name for resolved records.
Click Next when you are ready to proceed to define roles and other settings for the sprint workflow.
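When choosing the schema and table names, it may be simplest to pick names that are valid unquoted Snowflake identifiers. The Python sketch below checks a candidate name against the unquoted identifier rules. Whether the sprint dialog quotes names is not stated here, so treat this as an optional check. The function name and the example names are hypothetical, and reserved keywords are not checked.

```python
import re

# Snowflake unquoted identifiers start with a letter or underscore and
# contain only letters, digits, underscores, and dollar signs.
UNQUOTED_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_$]*")
MAX_IDENTIFIER_LENGTH = 255


def is_unquoted_identifier(name):
    """Return True if name can be used without double quotes."""
    return (
        len(name) <= MAX_IDENTIFIER_LENGTH
        and UNQUOTED_IDENTIFIER.fullmatch(name) is not None
    )


print(is_unquoted_identifier("SPRINT_RESOLVED_RECORDS"))  # True
print(is_unquoted_identifier("2024 resolved"))            # False
```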
Define roles and settings for the sprint workflow
The last step is to define roles and other settings.
-
Add owners
Add all users who should be owners of the sprint.
-
Add stewards
Add all data stewards for this sprint.
-
Record workflow
Select whether to add a second validation step by sprint owners.
Information note: If a user who is both sprint owner and data steward validates a record, the second validation step is bypassed.
-
Record assignation
Select whether to auto-assign records or to assign records manually to data stewards.
-
Auto
Records are assigned automatically to data stewards with an even distribution. Records will not be assigned to sprint owners who are not also data stewards.
-
Manual
Records will initially not be assigned to a data steward. Sprint owners and data stewards can assign records from Unassigned.
-
Priority
Set the priority for the sprint.
Click Save when you are ready to create the sprint.
The sprint is now created, and the assigned data stewards can start validating data.
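To give a feel for the Auto record assignment option, the following Python sketch distributes records evenly across data stewards in round-robin order. The product's actual assignment algorithm is not documented here. Round-robin is simply one way to obtain an even distribution, and the auto_assign function and steward names are hypothetical.

```python
from itertools import cycle


def auto_assign(record_ids, stewards):
    """Assign records to stewards in turn so counts differ by at most one."""
    if not stewards:
        raise ValueError("at least one data steward is required")
    assignments = {steward: [] for steward in stewards}
    for record_id, steward in zip(record_ids, cycle(stewards)):
        assignments[steward].append(record_id)
    return assignments


print(auto_assign([1, 2, 3, 4, 5], ["ana", "ben"]))
# {'ana': [1, 3, 5], 'ben': [2, 4]}
```

Sprint owners who are not also data stewards would simply be left out of the stewards list, which matches the behavior described for the Auto option.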