Follow these steps to create the second Job, which uploads the access log
file to HCatalog:
Procedure
Create a new Job and name it B_HCatalog_Load to identify its role and execution order
among the example Jobs.
From the Palette, drop a tApacheLogInput, a tFilterRow, a tHCatalogOutput, and a tLogRow component onto the design workspace.
Connect the tApacheLogInput component to the tFilterRow component using a Row > Main connection, and then connect the tFilterRow component to the tHCatalogOutput component using a Row > Filter connection.
This data flow loads the log file to be analyzed into the HCatalog database, with any records that have the error code "301" removed.
Connect the tFilterRow component to the tLogRow component using a Row > Reject connection.
This flow prints the records that have the error code "301" to the console.
Label these components to better identify their functionality.
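The routing that the two tFilterRow outputs perform can be pictured with a small standalone sketch. This is not Talend-generated code, only an illustration of the idea: records whose code is "301" go to a reject list (the flow printed by tLogRow), and all other records go to the list that would be written to HCatalog. The function name, the record fields, and the sample rows are hypothetical.

```python
def split_by_code(records, excluded_code=301):
    """Split records into (kept, rejected) lists based on their code.

    Hypothetical stand-in for the Filter and Reject outputs of tFilterRow:
    'kept' is the flow that would go to tHCatalogOutput,
    'rejected' is the flow that would go to tLogRow.
    """
    kept, rejected = [], []
    for record in records:
        if record["code"] == excluded_code:
            rejected.append(record)
        else:
            kept.append(record)
    return kept, rejected


# Made-up sample rows; a real access log has more fields.
sample = [
    {"host": "10.0.0.1", "url": "/index.html", "code": 200},
    {"host": "10.0.0.2", "url": "/old-page", "code": 301},
    {"host": "10.0.0.3", "url": "/missing", "code": 404},
]

kept, rejected = split_by_code(sample)

# Print the rejected rows, as the Reject flow does on the console.
for record in rejected:
    print(record)
```

In the Job itself, the condition is configured in the tFilterRow component rather than written by hand; the sketch only shows which records end up on which connection.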