a tPigLoad, to load the data to be analyzed,
a tPigFilterRow, to remove records with the '404' error from the input flow,
a tPigFilterColumns, to select the columns you want to include in the result data,
a tPigAggregate, to count the number of visits to the website,
a tPigSort, to sort the result data, and
a tPigStoreResult, to save the result to HDFS.
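The data flow these six components describe can be sketched in plain Python to show what each stage contributes. This is only an illustration of the logic, not the Pig code the Job generates: the sample records, the field layout (host, status code, URL), the choice to count visits per host, the descending sort order, and the function names `count_visits` and `store_result` are all assumptions made for the example, and the result is written to a local file rather than to HDFS.

```python
from collections import Counter

# Hypothetical access-log records: (host, status code, requested URL).
# In the Job, tPigLoad would read these from the input data instead.
SAMPLE_LOG = [
    ("10.0.0.1", "200", "/index.html"),
    ("10.0.0.2", "404", "/missing.html"),
    ("10.0.0.1", "200", "/about.html"),
    ("10.0.0.3", "200", "/index.html"),
    ("10.0.0.3", "404", "/old.html"),
    ("10.0.0.1", "200", "/index.html"),
]


def count_visits(records):
    """Return (host, visit count) pairs, most visits first."""
    # tPigFilterRow step: drop records whose status code is 404.
    kept = [record for record in records if record[1] != "404"]
    # tPigFilterColumns step: keep only the column needed for the result
    # (assumed here to be the host).
    hosts = [host for host, _status, _url in kept]
    # tPigAggregate step: count the visits per host.
    counts = Counter(hosts)
    # tPigSort step: order by visit count (descending), then by host name.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


def store_result(rows, path):
    """tPigStoreResult step, simplified: write the rows to a local file."""
    with open(path, "w", encoding="utf-8") as out:
        for host, visits in rows:
            out.write(f"{host};{visits}\n")


if __name__ == "__main__":
    for host, visits in count_visits(SAMPLE_LOG):
        print(host, visits)
```

Each function body maps to one or more components in the list, in the same order as the flow: filter rows, filter columns, aggregate, sort, then store.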