Character-based patterns
Talend Cloud Data Stewardship performs character-based pattern profiling
and computes the distribution of character patterns in the
data you load in any of the campaigns.
Latin characters and Asian characters (Hiragana, Katakana, Kanji, and Hangul) are represented by the following patterns:
Character | Pattern |
---|---|
Latin numbers | 9 replaces all ASCII digits |
Latin lowercase letters | a replaces all lowercase ASCII Latin characters |
Latin uppercase letters | A replaces all uppercase Latin characters |
Hiragana | H replaces all Hiragana characters |
Katakana | K replaces all Katakana characters |
Kanji | C replaces Chinese characters |
Hangul | G replaces Hangul characters |
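
To illustrate how the table above translates a value into a pattern, here is a minimal Python sketch. It is not the Talend implementation: the function names `char_pattern` and `pattern_distribution` are hypothetical, the Unicode ranges used for Hiragana, Katakana, Kanji and Hangul are assumptions, and leaving all other characters (spaces, punctuation) unchanged is also an assumption.

```python
from collections import Counter


def char_pattern(value: str) -> str:
    """Map each character of value to its pattern symbol (illustrative only)."""
    out = []
    for ch in value:
        cp = ord(ch)
        if "0" <= ch <= "9":            # ASCII digits
            out.append("9")
        elif "a" <= ch <= "z":          # lowercase ASCII Latin letters
            out.append("a")
        elif "A" <= ch <= "Z":          # uppercase ASCII Latin letters
            out.append("A")
        elif 0x3041 <= cp <= 0x309F:    # Hiragana block (assumed range)
            out.append("H")
        elif 0x30A0 <= cp <= 0x30FF:    # Katakana block (assumed range)
            out.append("K")
        elif 0x4E00 <= cp <= 0x9FFF:    # CJK Unified Ideographs (assumed range)
            out.append("C")
        elif 0xAC00 <= cp <= 0xD7A3:    # Hangul Syllables (assumed range)
            out.append("G")
        else:                           # assumption: other characters are kept as-is
            out.append(ch)
    return "".join(out)


def pattern_distribution(values):
    """Count how often each character pattern occurs in a list of values."""
    return Counter(char_pattern(v) for v in values)
```

With these assumptions, `char_pattern("Paris 75001")` yields `Aaaaa 99999`, and `pattern_distribution(["AB-12", "CD-34", "ef-56"])` counts the pattern `AA-99` twice and `aa-99` once.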