Extracting phone number information
You can use the Extract phone number information function to extract details about phone numbers into several new columns.
This function can extract the phone type, the country, the region, the geographic area, the carrier name and the timezone. However, the behavior of the function depends on the semantic type of the column containing the phone number data:
- If the semantic type corresponds to either US Phone, UK Phone, DE Phone or FR Phone, you can simply select which fields you want to output and apply the function.
- If the column contains numbers from different countries, with different formats, and the matching semantic type is the more generic Phone number, you will need to do some formatting before being able to use the Extract phone number information function. This step is necessary because numbers that are not standardized often have a structure that matches several countries, making it impossible to uniquely determine the country.
Let's take the example of a dataset containing basic customer information, such as names, countries and phone numbers, from clients all over the world. Your goal with this preparation is to keep only the customers who gave a mobile phone number as contact information. The Extract phone number information function could display the phone type, but because the numbers are in various formats, you cannot apply the function just yet. You are first going to perform a formatting operation on the phone column, using the information from the country column, to add an international prefix to your numbers. Talend Data Preparation will then be able to extract information from your phone numbers, which are now in a harmonized format and also carry information about their respective countries.
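To make the formatting step more concrete, here is a minimal Python sketch of the idea behind it: combining a national number with the calling code of its country to obtain a single, unambiguous international form. This is not how Talend Data Preparation performs the formatting internally. The `to_international` function and the `CALLING_CODES` table are hypothetical names introduced for illustration, and the table covers only four countries.

```python
import re

# Hypothetical lookup table for illustration only:
# ISO country code -> (international calling code, national trunk prefix).
# A real preparation would need a complete table.
CALLING_CODES = {
    "US": ("1", ""),
    "GB": ("44", "0"),
    "DE": ("49", "0"),
    "FR": ("33", "0"),
}


def to_international(raw_number, country):
    """Return the number as +<calling code><national number>.

    Returns an empty string when the input is empty or when the
    country is not in the lookup table.
    """
    stripped = (raw_number or "").strip()
    # Keep digits only: drop spaces, dots, dashes and parentheses.
    digits = re.sub(r"\D", "", stripped)
    if not digits:
        return ""
    # Numbers that already carry an international prefix are only cleaned.
    if stripped.startswith("+"):
        return "+" + digits
    if country not in CALLING_CODES:
        return ""
    calling_code, trunk_prefix = CALLING_CODES[country]
    # Simplification: remove one leading trunk prefix (the "0" in 01 23 ...).
    if trunk_prefix and digits.startswith(trunk_prefix):
        digits = digits[len(trunk_prefix):]
    return "+" + calling_code + digits


if __name__ == "__main__":
    rows = [
        ("01 23 45 67 89", "FR"),
        ("(202) 555-0100", "US"),
        ("020 7946 0000", "GB"),
        ("+49 30 123456", "DE"),
    ]
    for number, country in rows:
        print(country, to_international(number, country))
```

Once every number carries its country prefix, a string of digits can no longer be read as belonging to several countries, which is the ambiguity that prevents the extraction on the generic Phone number type. The sketch deliberately ignores edge cases, such as US numbers written with a leading 1.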
Procedure
Results
After a quick formatting step, the columns containing the information extracted from the phone numbers have been created. The information is extracted by the Google phone library. You can now easily identify which numbers are fixed-line or mobile numbers and continue your preparation.
Rows that were empty or invalid will generate empty cells after the function has been applied.