Unmasking Australian phone numbers

The Job in this scenario uses the tPatternUnmasking component to retrieve the original Australian phone numbers masked with the tPatternMasking component.

The original Australian phone numbers are 02 5550 8328, 08 5550 3018 and 07 5550 5556.

This scenario describes a Job which uses:
  • The tFileInputDelimited component to read a CSV file that contains Australian phone numbers masked with the tPatternMasking component.
  • The tPatternUnmasking component to unmask the input Australian phone numbers.
  • The tFileOutputDelimited component to output the masked and original phone number values.
    A Job using the tFileInputDelimited, tPatternUnmasking, and tFileOutputDelimited components.

To replicate this scenario, download and extract the masked_phonenumbers.zip file.

This file contains Australian phone numbers masked with the tPatternMasking component, using the FF1 with AES method combined with a user-defined password.

Setting up the Job

Procedure

  1. Drop the following components from the Palette onto the design workspace: tFileInputDelimited, tPatternUnmasking and tFileOutputDelimited.
  2. Connect the three components together using Row > Main links.

Configuring the input component

Procedure

  1. Double-click tFileInputDelimited to open its Basic settings view in the Component tab.
    Configuration of the tFileInputDelimited.
  2. In the File name/Stream field, set the path to the file that contains the masked phone numbers.

    Example

    In this example, set the path to the masked_phonenumbers.csv file.
  3. Select the CSV options check box.
  4. Click the [...] button next to Edit schema and use the [+] button in the dialog box to add a column of String type. In this example, the column is named PhoneNumber.
    Schema of the tFileInputDelimited component.
  5. Click OK in the dialog box and, when prompted, accept the propagation of the changes.
  6. In the Header field, enter 1.

Configuring the unmasking operations

Configure one unmasking operation for each part of the input phone numbers. Separators will be left unchanged in the unmasked values.

In the Modifications table, the settings must be the same as the ones used for the masking operations performed by the tPatternMasking component.

About this task

The masked Australian phone numbers use the XX XXXX XXXX format:
  • A two-digit area code
  • A space used as a separator
  • A first four-digit line number
  • A space used as a separator
  • A second four-digit line number

Procedure

  1. Double-click tPatternUnmasking to display its Basic settings view in the Component tab.
    Configuration of the tPatternUnmasking component.
  2. If required, click Sync columns to retrieve the schema defined in the input component.
  3. Click the Edit schema button to open the schema dialog box.

    tPatternUnmasking adds a read-only column to the output schema.

    Examples of input and output schemas.

    The ORIGINAL_MARK column labels output records:

    • Original records are labeled true.
    • Substitute records are labeled false.
  4. In the Modifications table, click the [+] button to add three rows.
    Each row corresponds to an unmasking operation for a part of the input phone numbers.
  5. In the Modifications table, edit the first row to configure the unmasking operation for prefixes:
    1. From the Column to unmask field, select the column which holds the data to be unmasked.
      In this example, select PhoneNumber.
    2. From the Field type field, select Enumeration as the field type the data belongs to and enter "02,03,07,08" in the Values field.
  6. In the Modifications table, edit the second row to unmask the first four-digit line numbers:
    1. From the Column to unmask field, select the column which holds the data to be unmasked.
      In this example, select PhoneNumber.
    2. From the Field type field, select Interval as the field type the data belongs to and enter "2000,9999" in the Range field.
  7. In the Modifications table, configure the third row to unmask the second four-digit line numbers:
    1. From the Column to unmask field, select the column which holds the data to be unmasked.
      In this example, select PhoneNumber.
    2. From the Field type field, select Interval as the field type the data belongs to and enter "0000,9999" in the Range field.
  8. Click the Advanced settings tab and select the Output the original row? check box.
    The Job will output original and substitute records.
  9. From the Method list, select the method used when the data was masked using the tPatternMasking component.

    Example

    In this example, select FF1 with AES.

    When you use an FF1 method, the number of possible values that the component can generate from the input pattern must be greater than or equal to 1,000,000.

  10. In the Password or 256-bit key for FF1 methods field, enter the user-defined password used when the data was masked using the tPatternMasking component.

    Example

    In this example, enter "talend".
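As a rough check of the FF1 requirement on the number of possible values, the following Python sketch multiplies the sizes of the three fields configured in the Modifications table. It assumes that the pattern domain is the product of the field sizes and that the second line number covers the four-digit range 0000 to 9999; the helper names are hypothetical and are not part of the component.

```python
from math import prod

# Minimum number of possible values required by the FF1 methods.
FF1_MINIMUM_DOMAIN = 1_000_000


def enumeration_size(values):
    """Number of entries in a comma-separated Enumeration field."""
    return len(values.split(","))


def interval_size(range_text):
    """Number of integers in an inclusive 'low,high' Interval field."""
    low, high = (int(bound) for bound in range_text.split(","))
    return high - low + 1


field_sizes = [
    enumeration_size("02,03,07,08"),  # area codes
    interval_size("2000,9999"),       # first four-digit line number
    interval_size("0000,9999"),       # second four-digit line number (assumed range)
]

total_values = prod(field_sizes)
print(field_sizes, total_values, total_values >= FF1_MINIMUM_DOMAIN)
```

Under these assumptions the pattern yields 4 x 8,000 x 10,000 = 320,000,000 possible values, which is well above the 1,000,000 threshold.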

Configuring the output component and executing the Job

Procedure

  1. Double-click the tFileOutputDelimited component to display the Basic settings view and define the component properties.
  2. In the File Name field, set the path to the file that will contain the unmasked values.
  3. Press F6 to save and execute the Job.

Results

PhoneNumber;ORIGINAL_MARK
02 5550 8328;false
08 5550 3018;false
07 5550 5556;false

Because a format-preserving encryption method and a password were used to mask the phone numbers bijectively, the component retrieved the original phone numbers.
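To confirm the result outside the Job, the output can be compared with the original phone numbers listed at the start of this scenario. The Python sketch below is a minimal, hypothetical check: it reads the result lines shown above from an in-memory string rather than from the output file, and assumes a semicolon field separator as displayed in the results.

```python
import csv
import io

# Result lines as shown in this scenario (held in memory for the sketch).
RESULTS = """PhoneNumber;ORIGINAL_MARK
02 5550 8328;false
08 5550 3018;false
07 5550 5556;false
"""

# Original phone numbers stated at the start of the scenario.
EXPECTED = ["02 5550 8328", "08 5550 3018", "07 5550 5556"]

# Parse the semicolon-delimited rows, using the first line as the header.
reader = csv.DictReader(io.StringIO(RESULTS), delimiter=";")
unmasked = [row["PhoneNumber"] for row in reader]

all_recovered = unmasked == EXPECTED
print(unmasked, all_recovered)
```

To check a real run, replace the in-memory string with the file set in the tFileOutputDelimited component.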
