Formatting Telephone Numbers
I created this simple PowerCenter mapplet to format phone numbers. Informatica Developer (IDQ) may have a different or easier way to do it but I wanted something I could use in PowerCenter without the hassle of maintaining a separate object in IDQ and having to import it into PowerCenter. You can alter the mapplet to format other data like postal codes, SSNs, etc.
Download the mapplet and give it a try.
Mapplet Overview
My approach is a simple one, take the phone number, remove all special characters, then reformat the number with dashes as 999-999-9999. This approach allows me to ignore how the phone number comes into the mapplet, for example partially formatted , not formatted or formatted correctly but not how I need it.
The mapplet's input is the telephone number without a country identifier. In addition to the reformatted phone number, the mapplet also outputs:
- a Pattern (example xxx-xxx-xxx) based on the input number (this could be used for a quick/dirty data profiling)
- the phone number parsed into its parts, area code, central office
- the length of the input phone number
Read further to see detail on the mapplet works.
Telephone Pattern
Reg_Replace(Telephone, '[0-9]' , 'x' )
I like this bit of code because you can use it to build patterns then use the patterns to find outliers. Just output the pattern(s) to a table or spreadsheet and DISTINCT the column to see the quality of your data. You could do the same thing for postal codes, SSNs, ext.
Telephone Number, without formatting, numbers only
REPLACECHR(0, Telephone_in, REPLACECHR(0, Telephone_in, '0123456789', ''), '')
If you want to reformat a telephone number, first remove any special characters from it's current formatting. Here I use two REPLACECHR functions, the inner REPLACECHR strips the numbers from Telephone, the result (which should be any special characters) is used as the character set for the outer REPLACECHR.
Area Code
IIF(LENGTH(Telephone_NbrOnly_v)>7, SUBSTR(Telephone_NbrOnly_v,1,3) ,'')
I use the "numbers only" version of the telephone number to get the area code. If the telephone number is more than 7 numbers, I make the assumption the first 3 characters are the area code. If the telephone number is is 7 characters or less, there is no area code.
I also break out the Telephone number into separate fields, area code, central office, and extension. Then to output the formatted telephone number. I concatenate the telephone number with - (dash).
Decode(TRUE, LENGTH( Telephone_NbrOnly_v)>7, Telephone_AreaCode_v || '-' || Telephone_CentralOffice_v || '-' || Telephone_LineNumber_v, LENGTH( Telephone_NbrOnly_v)=7, Telephone_CentralOffice_v || '-' || Telephone_LineNumber_v, '')
Thank you for your support
or |
Informatica Data Quality online training
Informatica Data Quality training
Informatica idq online training
Informatica idq training
Informatica mdm online training
Informatica mdm training
Informatica message Queue online training
Informatica message Queue training
Informatica power center online training
Informatica power center training
Manual Testing online training
Manual Testing training
Microservices online training
Microservices training
Office 365 online training
Office 365 training
Open stack online training
Open stack training