Selecting the salary records above the average using a Map/Reduce Job
This scenario applies only to Talend products with Big Data.
For more technologies supported by Talend, see Talend components.
In this scenario, a six-component Job is created to calculate the average salary of a set of sample data and select the salaries above the average.
The sample data is already stored in the HDFS system to be used and reads as
follows:
1 Lyndon 1200
2 Ronald 3500
3 Ulysses 5000
4 Harry 2000
5 Garfield 1800
6 James 3300
7 Chester 4200
8 Dwight 2200
9 Jimmy 2800
10 Herbert 3500
The fields are separated by a tab (\t), and the three columns of the sample data are id, name and salary.
To write the sample data to the HDFS system to be used, you can use the tHDFSOutput component. For further information, see tHDFSOutput.
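To make the expected result concrete, the calculation this scenario describes can be sketched outside Talend in plain Python. This is only an illustration of the logic (parse the tab-separated records, compute the average salary, keep the records above it), not the code the Job generates. The names SAMPLE, parse_records and above_average are hypothetical, and the data is held in memory rather than read from HDFS.

```python
# Sample records from the scenario, written with tab separators.
# Assumption: in-memory text stands in for the file stored in HDFS.
SAMPLE = "\n".join([
    "1\tLyndon\t1200",
    "2\tRonald\t3500",
    "3\tUlysses\t5000",
    "4\tHarry\t2000",
    "5\tGarfield\t1800",
    "6\tJames\t3300",
    "7\tChester\t4200",
    "8\tDwight\t2200",
    "9\tJimmy\t2800",
    "10\tHerbert\t3500",
])


def parse_records(text):
    """Split tab-separated lines into (id, name, salary) tuples."""
    records = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        rec_id, name, salary = line.split("\t")
        records.append((int(rec_id), name, int(salary)))
    return records


def above_average(records):
    """Return the average salary and the records strictly above it."""
    average = sum(salary for _, _, salary in records) / len(records)
    selected = [rec for rec in records if rec[2] > average]
    return average, selected


records = parse_records(SAMPLE)
average, selected = above_average(records)

print("average:", average)
for rec_id, name, salary in selected:
    print(rec_id, name, salary)
```

With the ten sample records, the salaries sum to 29500, so the average is 2950 and five records (Ronald, Ulysses, James, Chester and Herbert) are selected.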