
Evaluating and generating a classification model

The tNLPModel component reads training data in CoNLL format to evaluate and generate a classification model.
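As a rough illustration of what CoNLL-style training data looks like, the sketch below parses a small sample in Python. It assumes one token per line with the label in the last whitespace-separated column and a blank line between sentences. The exact column layout that tNLPModel expects is not stated here, so treat the sample data and the `parse_conll` helper as hypothetical.

```python
from typing import List, Tuple

# A sentence is a list of (token, label) pairs.
Sentence = List[Tuple[str, str]]


def parse_conll(text: str) -> List[Sentence]:
    """Parse CoNLL-style text into sentences of (token, label) pairs.

    Assumption: the first column is the token, the last column is the
    label, and a blank line marks the end of a sentence.
    """
    sentences: List[Sentence] = []
    current: Sentence = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            # Blank line: close the current sentence, if any.
            if current:
                sentences.append(current)
                current = []
            continue
        columns = line.split()
        current.append((columns[0], columns[-1]))
    if current:
        sentences.append(current)
    return sentences


# Hypothetical sample data, for illustration only.
sample = """Alice B-PER
visited O
Paris B-LOC

She O
stayed O
"""

sentences = parse_conll(sample)
for sentence in sentences:
    print(sentence)
```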

Procedure

  1. Double-click the tNLPModel component to open its Basic settings view and define its properties.
    1. Click the [+] button under the Feature template table to add rows to the table.
    2. Click in the Features column to select the features to be generated.
    3. For each feature, specify the relative position.

      For example, -2,-1,0,1,2 means that the current token, the two preceding tokens, and the two following tokens are used as features.

    4. From the NLP Library list, select the same library you used for preprocessing the training text data.
  2. To evaluate the model, select the Run cross validation evaluation check box.
  3. Select the Save the model on file system and the Store model in a single file check boxes to save the model locally in the folder specified in the Folder field.
  4. Optional: Change the logging output level for the execution of the Job to output the best weighted F1-score for each improvement of the model in the Run view:
    1. In the Run view, click the Advanced settings tab.
    2. Select the log4jLevel check box, and select Info from the list.
  5. Press F6 to save and execute the Job.
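
The relative positions entered in the Feature template table in step 1 can be pictured as a window sliding over the tokens. The Python sketch below is only an illustration of that idea, not the component's implementation: the `window_features` helper, the feature names, and the `<PAD>` placeholder for positions outside the sentence are all assumptions.

```python
from typing import Dict, List


def window_features(tokens: List[str], index: int, positions: List[int]) -> Dict[str, str]:
    """Collect the tokens found at the given relative positions around tokens[index].

    Positions that fall outside the sentence are filled with a "<PAD>"
    placeholder (an assumption made for this sketch).
    """
    features: Dict[str, str] = {}
    for offset in positions:
        j = index + offset
        if 0 <= j < len(tokens):
            features[f"token[{offset}]"] = tokens[j]
        else:
            features[f"token[{offset}]"] = "<PAD>"
    return features


tokens = ["Alice", "visited", "Paris", "last", "week"]

# Relative positions -2,-1,0,1,2 around the token "Paris" (index 2).
feats = window_features(tokens, 2, [-2, -1, 0, 1, 2])
print(feats)

# At the start of the sentence, the two preceding positions are padded.
first = window_features(tokens, 0, [-2, -1, 0, 1, 2])
print(first)
```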

Results

If you set the log4jLevel value to Info, the best weighted F1-score is output to the console of the Run view for each improvement of the model.

The following items are also output to the console of the Run view:

For each class:

  The class name.
  True Positive: the number of elements that were correctly predicted as belonging to this class.
  Predicted True: the number of elements that were predicted as belonging to this class.
  Labeled True: the number of elements that actually belong to this class.
  Precision score: a value from 0 to 1 that indicates what proportion of the elements predicted as this class actually belong to it (True Positive divided by Predicted True).
  Recall score: a value from 0 to 1 that indicates what proportion of the elements belonging to this class were predicted as such (True Positive divided by Labeled True).
  F1-score: the harmonic mean of the Precision score and the Recall score.

For the best model:

  The global weighted F1-score.
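
To show how the per-class values relate to one another, the Python sketch below computes the Precision score, Recall score, and F1-score from the three counts. For the global score it assumes that "weighted" means each class's F1-score is weighted by its Labeled True count, which is a common convention but is not confirmed for this component. The counts and function names are hypothetical.

```python
from typing import Dict, Tuple


def class_scores(true_positive: int, predicted_true: int, labeled_true: int) -> Tuple[float, float, float]:
    """Return (precision, recall, f1) for one class from its three counts."""
    precision = true_positive / predicted_true if predicted_true else 0.0
    recall = true_positive / labeled_true if labeled_true else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # Harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


def weighted_f1(counts: Dict[str, Tuple[int, int, int]]) -> float:
    """Average the per-class F1-scores, weighted by Labeled True.

    Assumption: weighting by the number of labeled elements per class.
    """
    total = sum(labeled for _, _, labeled in counts.values())
    if total == 0:
        return 0.0
    weighted_sum = sum(
        class_scores(tp, predicted, labeled)[2] * labeled
        for tp, predicted, labeled in counts.values()
    )
    return weighted_sum / total


# Hypothetical counts: class -> (True Positive, Predicted True, Labeled True).
counts = {"PER": (6, 8, 12), "LOC": (3, 4, 4)}

for name, (tp, predicted, labeled) in counts.items():
    precision, recall, f1 = class_scores(tp, predicted, labeled)
    print(f"{name}: precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")

print(f"weighted F1: {weighted_f1(counts):.4f}")
```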

The model file is stored in the specified folder. You can now use the generated model with the tNLPPredict component to predict named entities and label text data automatically.
