Click Sync columns to retrieve the schema from the preceding component. This is effectively the read-only schema of tKafkaInput, because tWindow does not alter the schema.
Click the [...] button next to Edit schema to open the schema editor.
Rename the single column of the output schema to hashtag. This column is used to carry the hashtag field extracted from the Tweet JSON data.
Click OK to validate these changes.
From the Read by list, select JsonPath.
From the JSON field list, select the column of the input schema from which you need to extract fields. In this scenario, select payload.
In the Loop Jsonpath query field, enter the JSON path pointing to the element over which the extraction loops. Based on the JSON structure of a Tweet described in the Twitter documentation, enter $.entities.hashtags to loop over the hashtags entity.
In the Mapping table, the hashtag column of the output schema has been filled in automatically. In the Json query column, enter the element on which the extraction is performed. In this example, this is the text attribute of each hashtags entity, so enter "text", including the double quotation marks.
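
To make the effect of the loop query and the mapping concrete, the following is a minimal Python sketch of the same extraction logic. It is not the component's implementation: it walks the parsed JSON by hand instead of evaluating a JsonPath expression, and the sample payload and the extract_hashtags helper are hypothetical, introduced only for illustration. It assumes that each message in the payload column holds one Tweet as a JSON string.

```python
import json

# Hypothetical, heavily trimmed Tweet payload (assumption: real Tweets
# carry many more fields; only entities.hashtags matters here).
sample_payload = json.dumps({
    "id": 1,
    "text": "Streaming with #Kafka and #Spark",
    "entities": {
        "hashtags": [
            {"text": "Kafka", "indices": [15, 21]},
            {"text": "Spark", "indices": [26, 32]},
        ]
    },
})


def extract_hashtags(payload):
    """Return one output row per hashtag found in a Tweet JSON string."""
    tweet = json.loads(payload)
    # Equivalent of the loop query $.entities.hashtags:
    # iterate over every element of the hashtags array.
    loop_elements = tweet.get("entities", {}).get("hashtags", [])
    # Equivalent of mapping the "text" query to the hashtag column:
    # read the text attribute of each element.
    return [{"hashtag": element.get("text")} for element in loop_elements]


rows = extract_hashtags(sample_payload)
for row in rows:
    print(row)
```

Each element of the hashtags array produces one row with a single hashtag column, which mirrors the single-column output schema defined earlier. A Tweet without hashtags produces no rows in this sketch.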