Character-based patterns
Talend Data Preparation allows you to analyze the distribution of character-based patterns in your data.
Latin characters and Asian characters (Hiragana, Katakana, Kanji, and Hangul) are represented by the following patterns:
Character | Pattern |
---|---|
Latin numbers | 9 replaces all ASCII digits |
Latin lowercase letters | a replaces all lowercase Latin characters |
Latin uppercase letters | A replaces all uppercase Latin characters |
Hiragana | H replaces all Hiragana characters |
Katakana | K replaces all Katakana characters |
Kanji | C replaces Chinese characters |
Hangul | G replaces Hangul characters |
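
The mapping in the table above can be illustrated with a short Python sketch. This is not Talend's implementation: the function names `to_pattern` and `pattern_distribution` are hypothetical, and the Unicode ranges are assumptions chosen for illustration (ASCII only for Latin letters and digits, the basic Hiragana and Katakana letters, the CJK Unified Ideographs block for Kanji, and the Hangul Syllables block). Characters outside these ranges are assumed to be kept unchanged.

```python
from collections import Counter

# Assumed code point ranges for each pattern symbol (illustrative only;
# the ranges actually used by Talend Data Preparation may differ).
_RANGES = [
    (0x0030, 0x0039, "9"),  # ASCII digits 0-9
    (0x0061, 0x007A, "a"),  # ASCII lowercase letters a-z
    (0x0041, 0x005A, "A"),  # ASCII uppercase letters A-Z
    (0x3041, 0x3096, "H"),  # Hiragana letters
    (0x30A1, 0x30FA, "K"),  # Katakana letters
    (0x4E00, 0x9FFF, "C"),  # CJK Unified Ideographs (Kanji)
    (0xAC00, 0xD7A3, "G"),  # Hangul syllables
]


def to_pattern(value):
    """Replace each character with its pattern symbol; keep others as-is."""
    out = []
    for ch in value:
        cp = ord(ch)
        for low, high, symbol in _RANGES:
            if low <= cp <= high:
                out.append(symbol)
                break
        else:
            # No range matched: punctuation, spaces, and so on stay unchanged.
            out.append(ch)
    return "".join(out)


def pattern_distribution(values):
    """Count how many values share each character-based pattern."""
    return Counter(to_pattern(v) for v in values)
```

Under these assumptions, `to_pattern("Ab12-cd")` yields `Aa99-aa`, and `pattern_distribution(["75001", "69002", "AB-12"])` counts the pattern `99999` twice and `AA-99` once.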