Data Masking effects
Text and semantic types
For textual data, Talend Data Preparation automatically suggests one of the predefined semantic types, one of your custom semantic types, or the Text type. Predefined and custom semantic types can be based on either a regular expression or a dictionary of values.
The following table lists the masking routines available for a column with the Text type, or any of the predefined or custom semantic types, and shows their effects using the value Talend in 2018 is awesome as an example.
Masking routine | Description | Parameters | Output |
---|---|---|---|
Semantic masking | | Masking mode: Random or Repeatable | Äåòçôî ëð 1889 òn äipïåvu |
Keep characters between two positions | All the characters included in the selected interval remain as is, while the ones outside the interval are deleted. | | 2018 is awesome |
Generate from Char Pattern | A record with random characters is created from the pattern of your choice. | | õaßayè 8908 æluäco |
Remove characters between two positions | All the characters included in the selected interval are removed, while the ones outside the interval remain as is. | | Talend is awesome |
Replace all | All the characters are replaced with the substitute of your choice. | | xxxxxxxxxxxxxxxxxxxxxxxxx |
Replace all digits | All the digits are replaced with the substitute of your choice. Letters are kept as is. | | Talend in 9999 is awesome |
Replace all letters | All the letters are replaced with the substitute of your choice. Digits are kept as is. | | yyyyyy yy 2018 yy yyyyyyy |
Replace characters between two positions | All the characters included in the selected interval are replaced, while the ones outside the interval remain as is. | | aaaaaa in 2018 is awesome |
Replace first n characters | Replaces the first n characters with the substitute of your choice, while the following ones remain as is. | | @@@@@@@@@@@@@@@@@ awesome |
Replace last n characters | Replaces the last n characters with the substitute of your choice, while the previous ones remain as is. | | Talend in 2018 !!!!!!!!!! |
Keep first n digits and replace following ones | Keeps the first n digits as is and replaces the subsequent ones with random digits. Non-digit characters remain as is. | | Talend in 2436 is awesome |
Keep last n digits and replace previous ones | Keeps the last n digits as is and replaces the previous ones with random digits. Non-digit characters remain as is. | | Talend in 1618 is awesome |
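
To make the effects of the position-based and replacement routines easier to follow, here is a minimal Python sketch that reproduces several of the outputs listed for Talend in 2018 is awesome. It is an illustration only, not Talend Data Preparation's implementation: the function names are hypothetical, positions are assumed to be 1-based and inclusive, the substitute is assumed to be a single character, and the positions and counts used to match the table are inferred from the outputs.

```python
import random


def mask_chars(value, sub, should_mask):
    """Replace every character for which should_mask(ch) is true."""
    return "".join(sub if should_mask(ch) else ch for ch in value)


def replace_all(value, sub):
    # Every character, spaces included, becomes the substitute.
    return sub * len(value)


def replace_all_digits(value, sub):
    return mask_chars(value, sub, str.isdigit)


def replace_all_letters(value, sub):
    return mask_chars(value, sub, str.isalpha)


def keep_between(value, start, end):
    # Assumption: start and end are 1-based, inclusive positions.
    return value[start - 1:end]


def remove_between(value, start, end):
    return value[:start - 1] + value[end:]


def replace_between(value, start, end, sub):
    selected = value[start - 1:end]
    return value[:start - 1] + sub * len(selected) + value[end:]


def replace_first_n(value, n, sub):
    return replace_between(value, 1, n, sub)


def replace_last_n(value, n, sub):
    start = max(len(value) - n + 1, 1)
    return replace_between(value, start, len(value), sub)


def keep_first_n_digits(value, n, rng=random):
    """Keep the first n digits; replace later digits with random ones."""
    seen = 0
    out = []
    for ch in value:
        if ch.isdigit():
            seen += 1
            out.append(ch if seen <= n else str(rng.randint(0, 9)))
        else:
            out.append(ch)  # non-digit characters remain as is
    return "".join(out)


def keep_last_n_digits(value, n, rng=random):
    # Mirror image of keep_first_n_digits: reverse, mask, reverse back.
    return keep_first_n_digits(value[::-1], n, rng)[::-1]
```

Under these assumptions, `keep_between(value, 11, 25)` returns 2018 is awesome, `remove_between(value, 7, 14)` returns Talend is awesome, and `replace_first_n(value, 17, "@")` returns @@@@@@@@@@@@@@@@@ awesome. The two digit-keeping functions produce random digits, so their output differs on each run unless a seeded `random.Random` instance is passed as `rng`.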
Numeric values
The following table lists the masking routines available for a column containing numeric values, with the Integer or Decimal type, and shows their effects using the value 21803 as an example.
Masking routine | Parameters | Output |
---|---|---|
Replace with random value | | 21499 |
Generate value between two values | | 21876 |
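
As a rough illustration of the Generate value between two values routine, the following Python sketch draws a random integer from an inclusive range. The function name and the bounds are hypothetical, and treating both bounds as inclusive is an assumption; the routine's actual parameters are not shown here.

```python
import random


def generate_between(low, high, rng=random):
    """Return a random integer N such that low <= N <= high (assumed inclusive)."""
    if low > high:
        raise ValueError("low must not exceed high")
    return rng.randint(low, high)


# Hypothetical bounds chosen around the example value 21803.
masked = generate_between(21000, 22000)
```

Passing a seeded `random.Random` instance as `rng` makes the result reproducible, which is convenient when checking the behavior in tests.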
Dates
The following table lists the masking routines available for a column with the Date semantic type, and shows their effects using the value 05/04/2018 as an example.
Masking routine | Parameters | Output |
---|---|---|
Replace with random date | | 23/11/2017 |
Keep year and set day and month to 01/01 | N/A | 01/01/2018 |
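
The two date routines can be sketched in Python as follows. This is an illustration under stated assumptions rather than the product's implementation: the function names are hypothetical, the dd/MM/yyyy format is inferred from the example output 23/11/2017, and the date range passed to the random routine is an invented parameter.

```python
import random
from datetime import datetime, timedelta

# Assumption: dates use the day/month/year layout seen in the examples.
DATE_FORMAT = "%d/%m/%Y"


def keep_year(value, fmt=DATE_FORMAT):
    """Keep the year and set the day and month to 01/01."""
    parsed = datetime.strptime(value, fmt)
    return parsed.replace(month=1, day=1).strftime(fmt)


def random_date_between(first, last, fmt=DATE_FORMAT, rng=random):
    """Return a random date between first and last, both inclusive."""
    start = datetime.strptime(first, fmt)
    end = datetime.strptime(last, fmt)
    if start > end:
        raise ValueError("first must not be later than last")
    offset = rng.randint(0, (end - start).days)
    return (start + timedelta(days=offset)).strftime(fmt)
```

With these definitions, `keep_year("05/04/2018")` returns 01/01/2018, matching the last row of the table.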