Creating a Job to divide the input text into tokens in CoNLL format

This Job uses tNLPPreprocessing to split a text sample in XML format into tokens, then uses tNormalize to convert the tokens to the CoNLL format.

Procedure

  1. Drop the following components from the Palette onto the design workspace: tFileInputXML, tNLPPreprocessing, tFilterColumns, tNormalize and tFileOutputDelimited.
  2. Connect the components using Row > Main connections.
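
To make the goal of the Job concrete, the following Python sketch imitates the transformation outside of the Studio. It is only an illustration, not the components' actual implementation: the regular-expression tokenizer is a stand-in for tNLPPreprocessing, the step that writes one token per row plays the role of tNormalize, and the function names and sample sentences are hypothetical. It assumes the simplest CoNLL-style layout, with one token per line and a blank line after each sentence.

```python
import re


def tokenize(text: str) -> list[str]:
    """Stand-in tokenizer (assumption): words and single punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", text)


def to_conll_rows(sentences: list[str]) -> list[str]:
    """Emit one token per row, with an empty row closing each sentence."""
    rows: list[str] = []
    for sentence in sentences:
        rows.extend(tokenize(sentence))  # one row per token
        rows.append("")                  # blank row marks the sentence boundary
    return rows


# Hypothetical sample text, standing in for the content read from the XML file.
sample = ["The Job reads a text sample.", "Each token goes on its own line."]
print("\n".join(to_conll_rows(sample)))
```

In the Job itself, the equivalent work is spread across the connected components, and the final rows are written by tFileOutputDelimited.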

Results
