Improving the Talend Trust Score™ of a dataset using Talend Cloud Data Preparation
Talend Cloud Data Preparation, in combination with Talend Cloud Data Inventory, can be used to improve the overall health and quality of your data.
In this example, you work for a B2B e-commerce company. As a business user, you need to monitor the quality and overall health of your organization's data, and also actively help improve them. This scenario shows how to navigate your company's dataset inventory, identify the datasets that need work, and fix different issues in order to improve their quality and their Talend Trust Score™.
Looking at your inventory through the Data Console
Use the Data console for a high level view of all your data.
After logging in to the Talend Cloud platform, open Talend Cloud Data Inventory to land on the Data Console view, which gives you visibility into all the datasets across the organization.
The Data console gives you instant insight into your data health and how to improve it, through tiles that each cover specific metrics of your dataset inventory, such as the Talend Trust Score™, data quality, semantic types, and more. You can start assessing the overall quality and trust by looking at the Talend Trust Score™ tile.
You can see the total score, a radar chart illustrating the five axes that make up the score, and a chart showing the overall and per-axis scores over time compared to the acceptable threshold defined beforehand.
Thresholds can be set for each aspect of the Talend Trust Score™, as well as for each tile, to define what is considered good or poor according to your organization's standards. Datasets that do not meet these thresholds are accessible directly from the tile so that you can take appropriate action if needed.
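To make the threshold idea concrete, here is a minimal sketch of how datasets falling under a threshold could be singled out. It is not how the Data console computes anything: the `DatasetScore` class, the `below_threshold` function, the dataset names, the scores, and the 0 to 5 scale are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class DatasetScore:
    """Hypothetical record: a dataset name and its overall score."""
    name: str
    trust_score: float  # assumed 0-5 scale, for illustration only


def below_threshold(datasets, threshold):
    """Return the names of datasets scoring under the threshold, worst first."""
    flagged = [d for d in datasets if d.trust_score < threshold]
    flagged.sort(key=lambda d: d.trust_score)
    return [d.name for d in flagged]


# Made-up inventory; in the product these values come from the tiles.
inventory = [
    DatasetScore("customers", 4.2),
    DatasetScore("orders", 2.1),
    DatasetScore("leads", 3.0),
]

print(below_threshold(inventory, 3.5))  # ['orders', 'leads']
```

Sorting worst first mirrors the goal of the scenario: the datasets that pull the overall score down the most are the ones to look at first.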
You will now refine your search using filters to find the datasets that bring down the overall Talend Trust Score™.
Using filters to find datasets to fix
Procedure
Results
Looking at the Data quality tile, you notice that the number of valid values across the datasets is also not acceptable.
In conclusion, the root cause for the recent drop in overall Talend Trust Score™ is among these remaining datasets. The next step is to look into the dataset list for more details.
Sharing the dataset to improve with competent users
Procedure
Results
Fixing the issues with Talend Cloud Data Preparation
Procedure
Results
Running the preparation to update the source dataset
Because of the splitting function that you used earlier, you have to complete a mapping step to reconcile the schema of the preparation with the schema of the destination dataset coming from the database.
After running the preparation, you will be able to see the impact of the preparation on the different quality indicators.
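The reason a mapping step is needed after a split can be sketched in a few lines. This is only an illustration under stated assumptions, not Talend's implementation: the helper functions, the column names (`full_name`, `full_name_split_1`, `first_name`, and so on), and the sample row are all hypothetical. Splitting a column produces new columns that the destination schema does not know about, so each destination column has to be told which prepared column feeds it.

```python
def split_column(row, source, targets, sep=" "):
    """Replace `source` with one column per name in `targets` (hypothetical helper)."""
    out = {k: v for k, v in row.items() if k != source}
    parts = row[source].split(sep, len(targets) - 1)
    parts += [""] * (len(targets) - len(parts))  # pad when there are fewer parts
    out.update(zip(targets, parts))
    return out


def map_to_destination(row, mapping):
    """Build a destination row from a {destination_column: prepared_column} mapping."""
    return {dest: row[src] for dest, src in mapping.items()}


# Made-up source row and the schema produced by the split.
row = {"id": 7, "full_name": "Ada Lovelace"}
prepared = split_column(row, "full_name", ["full_name_split_1", "full_name_split_2"])

# The mapping reconciles the prepared schema with the assumed destination schema.
mapping = {
    "id": "id",
    "first_name": "full_name_split_1",
    "last_name": "full_name_split_2",
}
result = map_to_destination(prepared, mapping)
print(result)  # {'id': 7, 'first_name': 'Ada', 'last_name': 'Lovelace'}
```

Keeping the mapping as explicit data, rather than relying on column order, makes it easy to see which destination column would be left without a source.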
Procedure
Results
Using Talend Cloud Data Inventory and Talend Cloud Data Preparation has allowed you to monitor the datasets of your whole organization, use different indicators to identify potential errors, and fix them accordingly, to improve the health of your data.