Text statistics
You can use the text statistics indicators to analyze columns only if their data mining type is set to nominal in the analysis editor. Otherwise, these statistics are grayed out in the Indicator Selection dialog box. For further information on the available data mining types, see Data mining types.
Text statistics analyze the characteristics of textual fields in the columns, including minimum, maximum, and average length.
- Minimal Length: computes the minimal length of a text field. It excludes null and blank values.
- Maximal Length: computes the maximal length of a text field. It excludes null and blank values.
- Average Length: computes the average length of a text field. It excludes null and blank values.
Other text indicators are available to compute each of the above indicators including null values, including blank values, or including both null and blank values.
Null values are counted as data of length 0, that is to say, the minimal length of null values is 0. This means that the Minimal Length With Null and the Maximal Length With Null indicators compute the minimal/maximal length of a text field including null values, which are considered to be 0-length text.
Blank values are counted as regular data of length 1. Empty values are counted as data of length 0, that is to say, the minimal length of blank values is 0. This means that the Minimal Length With Blank and the Maximal Length With Blank indicators compute the minimal/maximal length of a text field including blank values.
The same rules apply to all average indicators: empty values are also counted as data of length 0.
For example, compute the length of textual fields in a column containing the following values, using all different types of text statistic indicators:
Value | Number of characters |
---|---|
"Brayan" | 6 |
"Ava" | 3 |
"_" | 1 |
"" | 0 |
<null> | <null> |
"__________" | 10 |
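The counting rules described above can be illustrated with a minimal Python sketch. This is not the tool's implementation: the function names `text_lengths` and `length_stats` are hypothetical, the underscores in the table are assumed to stand for blank (space) characters, a "blank" value is assumed to be any empty or whitespace-only string, and blank values are assumed to count with their actual character length when included.

```python
from typing import List, Optional, Tuple


def text_lengths(values: List[Optional[str]],
                 include_null: bool = False,
                 include_blank: bool = False) -> List[int]:
    """Return the lengths that take part in the statistics (hypothetical helper)."""
    lengths = []
    for value in values:
        if value is None:
            # Assumption from the text: a null value counts as 0-length text.
            if include_null:
                lengths.append(0)
        elif value.strip() == "":
            # Assumption: empty or whitespace-only strings are "blank" and
            # are counted with their real length ("" -> 0, " " -> 1).
            if include_blank:
                lengths.append(len(value))
        else:
            lengths.append(len(value))
    return lengths


def length_stats(values: List[Optional[str]],
                 include_null: bool = False,
                 include_blank: bool = False) -> Optional[Tuple[int, int, float]]:
    """Return (minimal, maximal, average) length, or None if nothing qualifies."""
    lengths = text_lengths(values, include_null, include_blank)
    if not lengths:
        return None
    return min(lengths), max(lengths), sum(lengths) / len(lengths)


# The example column, with underscores replaced by spaces (assumption).
values = ["Brayan", "Ava", " ", "", None, " " * 10]

for label, with_null, with_blank in [
    ("plain", False, False),
    ("with null", True, False),
    ("with blank", False, True),
    ("with blank and null", True, True),
]:
    print(label, length_stats(values, with_null, with_blank))
```

Under these assumptions, the plain indicators use only "Brayan" and "Ava" (minimal 3, maximal 6, average 4.5), while every variant that includes null or blank values has a minimal length of 0. The figures reported by the tool itself may differ if its definition of blank values differs from the one assumed here.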
The following table shows the indicators that you can select in any database:
Data type | Number | | Text | | Date | | Others | |
---|---|---|---|---|---|---|---|---|
Analysis engine type | Java | SQL | Java | SQL | Java | SQL | Java | SQL |
Minimal Length | ||||||||
Minimal Length With Null | ||||||||
Minimal Length With Blank | ||||||||
Minimal Length With Blank And Null | ||||||||
Maximal Length | ||||||||
Maximal Length With Null | ||||||||
Maximal Length With Blank | ||||||||
Maximal Length With Blank And Null | ||||||||
Average Length | ||||||||
Average Length With Null | ||||||||
Average Length With Blank | ||||||||
Average Length With Blank And Null | ||||||||