Skip to main content Skip to complementary content

Consolidating a list of phone numbers coming from a CRM solution

The CRM Export dataset corresponds to an Excel file that has been exported from a CRM solution.

It contains a list of people with their phone numbers for both regular and mobile phones. As these phone numbers are French, they are 10 digits long; numbers starting with 01, 02, 03, 04, or 05 correspond to landline numbers and numbers starting with 06 correspond to mobile phone numbers.

In order to have a consolidated list of phone numbers to call, you will create a new column listing all the mobile phone numbers: this column will also be filled with the landline number when there is no mobile phone. Also, if the landline and mobile phone numbers have been mixed up, you will correct them.

Whereas in other tools you would use conditions like "if" to perform these actions, with Talend Data Preparation, you will create filters.

Download the file: CRM_export.csv.

Adding a preparation for the CRM Export dataset

Add a preparation to start preparing and cleansing your data.

You can create a preparation from a dataset already available in Talend Cloud Data Preparation or one of your local files. When you add a preparation with the corresponding button, it will be created in the folder in which you are currently working. Furthermore, your preparation will automatically be saved in the preparations list, and all the changes you make are also saved automatically.

Before you begin

You have created the CRM_export dataset in Talend Cloud Data Preparation using the previously downloaded file.

Procedure

  1. From the homepage, click Preparations to open the list of preparations.
  2. Click the Add Preparation button.
  3. From the list of Datasets, select CRM_export to use it as source material.
    Add a preparation window with a list of datasets available to create a preparation, including CRM_export.
    The Preparation name field is automatically filled in but you can change the preparation name.
  4. Click Submit.

Results

Your dataset opens with an empty recipe, and you can start adding preparation steps. All your changes are automatically saved.

Duplicating a column

In order to have a copy of the original data without having to create a new column and manually copy the data, you can simply duplicate a column.

Before working on the data, you will create a new column to receive the consolidated numbers.

Procedure

  1. Click the column you want to duplicate, mobile in this example.
  2. Click the menu icon or right-click the column to open the contextual menu.
  3. Click Duplicate Column.
    The mobile column menu is opened, with the Duplicate column option selected.

    A copy of the column, mobile_copy, is created with the same data as the original.

Results

The mobile column is duplicated and can now receive the consolidated data.

Renaming a column

In order to better identify a column, you can rename it.

You will rename the column previously created and give it a meaningful name.

Procedure

  1. Click the column you want to rename, mobile_copy in this example.
  2. Use the Rename Column option in the in the contextual menu for the column.
    The mobile_copy column menu is opened, with the Rename column option selected.
  3. Enter phone number to use as new name for the selected column and press Enter to apply it.

Results

The mobile_copy column is renamed.

Filling empty cells with data from another column

The white part of the quality bar indicates that the column contains empty records.

You will fill the empty cells from the phone number to use, with landline numbers extracted from the phone column.

Procedure

  1. Select the phone number to use column.

    You can identify columns containing empty records using the quality bar. The white part of it indicates that the column contains empty records. Data that matches the column type is shown in green, while orange shows invalid data that does not match the column type.

  2. In the functions panel, type Fill empty cells with text and click the result to open the options for the associated function.
  3. Configure the function as follows:
    • In the Use with field, select Other column.
    • In the Column field, select phone.
  4. Click the Submit button to apply the function.

Results

The empty cells in the phone number to use column are filled with data from the phone column.

Identifying if your data matches a pattern

If you want to identify whether there are mobile phone numbers among the landline numbers, you can use the Match pattern function.

Because the French mobile phone numbers start with 06, you are going to search for this pattern in the phone column.

Procedure

  1. Select the column on which you want to apply the pattern, phone in this example.
  2. In the functions panel, type Match pattern and click the result to open the options for the associated function.
  3. Configure the function as follows:
    • In the Pattern field, select Other.
    • In the Operator field, select the Starts with operator.
    • In the Manual pattern field, enter 06.
  4. Click the Submit button to apply the function.

    A new column is created, with the value true if the pattern matches and false if it does not

Results

The information is stored in a new column and you can now apply a filter to isolate the mobile phone numbers present in the phone column.

Creating a filter

Creating a filter is a quick way to identify or isolate data.

You will isolate the values in the phone column that actually match the pattern defined in the previous step. This will be possible through the use of a filter on the values that were found as true in the phone_matching column.

Procedure

  1. Select the phone_matching column.
  2. In the data profiling area, on the bottom right of the screen, you can see an horizontal bar chart, displaying the number of occurrences of the true and false values.
    Bar chart showing repartition of true and false values.

    Diagrams are a convenient and quick way to apply filters on your data, for one or several columns at a time.

  3. Click the bar displaying the true values.

    A filter is applied on your data, and only the entries that match the value true are displayed in the grid, as you can see in the filter bar.

    A filter is applied to only show the true values.

Results

The data is filtered on the value true, and you will now work on this small sample of data.

Replacing values with data from another column

You can take the data from any column, and use it to consolidate another.

Now that only the mobile phone numbers are selected, you will use these values to replace the wrong numbers in the phone number to use column.

Procedure

  1. Select a column in which you want to replace values, phone number to use in this example.
  2. In the Functions panel, type Fill cells with value and click the result to open the options for the associated function.
  3. Configure the function as follows:
    • In the Use with field, select Other column.
    • In the Column field, select phone.

    Make sure the Filtered rows option is selected in order to only apply the function on the filtered lines.

  4. Click the Submit button to apply the function.

    The filtered content of the selected column, phone number to use in this example, is replaced with the data from the phone column.

  5. In the filter bar, click the garbage bin icon to clear the filter and display the whole dataset again.

Results

You have used the filtered data to update the information of the consolidated column.

Deleting the column used for matching

If you want to remove a column you do not need, you can delete a column.

Now that you have used the phone_matching for filtering purposes, you can delete it.

Procedure

  1. Click the column you want to delete, phone_matching in this example.
  2. Click the menu icon or right-click the column to open the contextual menu.
  3. Click Delete column.
    The phone_matching column menu is opened, with the Delete column option selected.

Results

The phone_matching column is deleted.

Removing empty records from a column

The grey part of the quality bar indicates that a column contains empty records. You may want to remove the rows containing these empty records.

Column headers with a grey part in the quality bars.

In the quality bar, data that matches the column type is shown in green, while red shows invalid data that does not match the column type.

Procedure

  1. In the top left of the grid, click the menu icon and select Display rows with empty values.

    This action applies a filter on all the empty entries from the dataset.

    A rows with empty values filter is applied on the dataset.
  2. In the functions panel, on the top right of your screen, type Delete these filtered rows and click the result to execute the associated function.

    All the rows containing empty cells are removed from the dataset.

  3. Click the bin icon in the filter bar to clear the filter and display the whole dataset again.

Results

All the rows containing empty records are removed from the dataset and the quality bar under each column is now fully green.

Exporting the consolidated phone numbers

Once your preparation is complete, you may want to export the data you have cleansed.

You have managed to filter and consolidate a list of phone numbers, and you can now export the result of this preparation.

Procedure

  1. Click the Export button.
  2. Choose the file format you want to use when exporting your data:
    • If you choose Local CSV file, select which field delimiter, text enclosure and escape characters to use and enter a name for the file to export.
    • If you choose Local XLSX file, choose a name for the file to export.
    • If you choose Amazon S3, enter your credentials and other information to store your file on Amazon S3.

Results

The data you cleansed using your preparation is exported to a local file.

Did this page help you?

If you find any issues with this page or its content – a typo, a missing step, or a technical error – please let us know!