Character-based patterns

Talend Cloud Data Stewardship performs character-based pattern profiling and computes the distribution of character patterns in the data you load into any campaign.

Latin characters and Asian characters (Hiragana, Katakana, Kanji and Hangul) are represented by the following patterns:

Character | Pattern
Latin numbers | 9 replaces all ASCII digits
Latin lowercase letters | a replaces all lowercase ASCII Latin characters
Latin uppercase letters | A replaces all uppercase Latin characters
Hiragana | H replaces all Hiragana characters
Katakana | K replaces all Katakana characters
Kanji | C replaces Chinese characters
Hangul | G replaces Hangul characters
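
To illustrate the replacement rules listed above, here is a minimal Python sketch. It is not the product's implementation: the function names `char_pattern` and `pattern_distribution` are hypothetical, the Unicode block ranges are assumptions (the exact ranges used by the product are not stated here), and leaving all other characters unchanged is also an assumption.

```python
from collections import Counter

# Assumed Unicode blocks for each pattern symbol. The ranges the product
# actually uses may be wider or narrower than these.
_BLOCKS = [
    (0x3040, 0x309F, "H"),  # Hiragana block
    (0x30A0, 0x30FF, "K"),  # Katakana block
    (0x4E00, 0x9FFF, "C"),  # CJK Unified Ideographs (Kanji / Chinese characters)
    (0xAC00, 0xD7A3, "G"),  # Hangul syllables
]


def char_pattern(value: str) -> str:
    """Return the character-based pattern of a single value."""
    out = []
    for ch in value:
        if "0" <= ch <= "9":
            out.append("9")  # ASCII digit
        elif "a" <= ch <= "z":
            out.append("a")  # lowercase ASCII Latin letter
        elif "A" <= ch <= "Z":
            out.append("A")  # uppercase ASCII Latin letter
        else:
            code = ord(ch)
            for low, high, symbol in _BLOCKS:
                if low <= code <= high:
                    out.append(symbol)
                    break
            else:
                # Assumption: any other character is kept as-is.
                out.append(ch)
    return "".join(out)


def pattern_distribution(values):
    """Count how many values share each character pattern."""
    return Counter(char_pattern(v) for v in values)
```

Under these assumptions, `char_pattern("Paris 75001")` returns `"Aaaaa 99999"`, and `pattern_distribution(["AB12", "CD34", "ab"])` counts the pattern `AA99` twice and `aa` once.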
