Hashing fields to compare data safely
Before you begin
- You have previously created a connection to the system storing your source data.
  In this example, an Amazon S3 connection is used.
- You have previously added the dataset holding your source data.
  Download the file: string-crops.csv. It contains data about crops harvested in Mali, including crop types, value of production, harvested areas, and so on.
- You have also created the connection and the related dataset that will hold the processed data.
  In this example, a dataset stored in the same S3 bucket is used.
Procedure
Results
Your pipeline is being executed. The data is hashed, identical fields are merged and reorganized according to the conditions you stated, and the output is sent to the target system you indicated.
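To illustrate the idea outside the pipeline, here is a minimal Python sketch of hashing a field and grouping records whose hashed values are identical. It is not the processor's implementation: the sample rows, the column names (crop, value_of_production, harvested_area), the helper names, and the choice of SHA-256 are all assumptions made for the example, and the actual columns of string-crops.csv may differ.

```python
import csv
import hashlib
import io
from collections import defaultdict

# Hypothetical sample rows; the real string-crops.csv columns may differ.
SAMPLE_CSV = """crop,value_of_production,harvested_area
Millet,1200,300
Sorghum,950,210
Millet,1200,300
"""


def hash_field(value: str) -> str:
    """Return the SHA-256 hex digest of a field value (UTF-8 encoded)."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


def group_by_hashed_field(rows, field):
    """Group rows by the hash of one field, so equal values share a key."""
    groups = defaultdict(list)
    for row in rows:
        groups[hash_field(row[field])].append(row)
    return dict(groups)


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
groups = group_by_hashed_field(rows, "crop")

# Print a shortened digest and the number of rows that share it.
for digest, members in groups.items():
    print(digest[:12], len(members))
```

Because equal inputs always produce equal digests, records can be compared on the hashed field without exposing the original value. Note that an unsalted hash of a short, predictable value can be guessed by hashing candidate values, so a keyed hash (for example, Python's hmac module) may be preferable when the field is sensitive.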