Regex for phone numbers

Question 1

I want to extract phone numbers from a text using a regex.

Examples:

I have this regex which finds the phone number very well:
^((\(?\+45\)?)?)(\s?\d{2}\s?\d{2}\s?\d{2}\s?\d{2})$

and it catches all the numbers below well.

But I cannot catch the “tel.”, “tlf”, “mobil:”, etc. that may come before the number. Also, if a letter comes right after the last digit, the number is no longer matched, but it should be.

These examples are not covered:

tel.: +45 09827374, +45 89895867, some kind of text... 
mobil: +45 20802020, +45 20802001,
tlf.: +45 5555 1212 
tlf: +4567890202Girrafe

If helpful, I found this regex:
'\btlf\b\D*([\d\s]+\d)', which extracts the tlf label together with the number and stops before the first letter that follows the digits.

So I tried to combine them and got this, but it doesn’t work:
\b(tlf|mobil|telephone|mobile|tel)\b\D*(^((\(?\+45\)?)?)(\s?\d{2}\s?\d{2}\s?\d{2}\s?\d{2})$)
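A likely reason both patterns fail on the labelled examples is the anchors: ^ and $ force the number to span the whole string, and in the combined pattern the ^ sits after the label, where it can never match without a multiline flag. Here is a minimal Python sketch of that behaviour, assuming the $ characters inside the parentheses were originally escaped parentheses \( and \). The helper names anchored_matches and combined_matches are made up for illustration.

```python
import re

# Original pattern from the question (escaped parentheses assumed).
ANCHORED = re.compile(r"^((\(?\+45\)?)?)(\s?\d{2}\s?\d{2}\s?\d{2}\s?\d{2})$")

# Combined attempt: the ^ and $ now sit in the middle of the pattern.
COMBINED = re.compile(
    r"\b(tlf|mobil|telephone|mobile|tel)\b\D*"
    r"(^((\(?\+45\)?)?)(\s?\d{2}\s?\d{2}\s?\d{2}\s?\d{2})$)"
)

def anchored_matches(text):
    """True if the fully anchored pattern matches the whole string."""
    return ANCHORED.search(text) is not None

def combined_matches(text):
    """True if the combined pattern matches anywhere in the string."""
    return COMBINED.search(text) is not None

print(anchored_matches("+4567890202"))              # bare number: matches
print(anchored_matches("tlf: +4567890202Girrafe"))  # label and trailing letters: no match
print(combined_matches("tlf: +4567890202"))         # ^ after the label can never match
```

Dropping the anchors and guarding the end of the number in some other way is one way around this.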

Expected output:

for input: "tel.: +45 09827374, +45 89895867, some kind of text..." –> output: "tel.: +45 09827374" and "+45 89895867"
for input: "mobil: +45 20802020, +45 20802001," –> output: "mobil: +45 20802020" and "+45 20802001" or "mobil: +45 20802020, +45 20802001" is ok too
for input: "tlf +45 5555 1212" –> output: "tlf +45 5555 1212"
for input: "tlf: +4567890202Girrafe" –> output: "tlf: +4567890202"
for input: "+4567890202" –> output: "+4567890202"

Can you help me please?

Question 2

If you want the full match only:

(?:\b(?:tlf|mobile?|tel(?:ephone)?)[.:\s]+)?(?:\(\+45\)|\+45)?\s*\d{2}(?:\s?\d{2}){3}(?!\d)

The pattern matches:

(?: Non capture group
- \b A word boundary to prevent a partial word match
- (?:tlf|mobile?|tel(?:ephone)?) match one of the alternatives
- [.:\s]+ match 1+ occurrences of either . : or a whitespace char
)? Close the non-capture group and make it optional
(?:\(\+45\)|\+45)? Optionally match either (+45) or +45
\s*\d{2}(?:\s?\d{2}){3} Match optional whitespace chars, then 4 pairs of digits with an optional whitespace char between the pairs
(?!\d) Negative lookahead, assert not a digit directly to the right
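
As a quick check, the pattern can be run over the inputs from the question. This is only a sketch in Python using the standard re module. The names PHONE and find_numbers are made up for illustration, and the pattern is split across several string literals purely so each part can carry a comment.

```python
import re

PHONE = re.compile(
    r"(?:\b(?:tlf|mobile?|tel(?:ephone)?)[.:\s]+)?"  # optional label such as "tlf.: " or "mobil: "
    r"(?:\(\+45\)|\+45)?"                            # optional country code, with or without parentheses
    r"\s*\d{2}(?:\s?\d{2}){3}"                       # four pairs of digits, optionally space-separated
    r"(?!\d)"                                        # must not be followed by another digit
)

def find_numbers(text):
    """Return every full match of PHONE in text, in order of appearance."""
    return [m.group(0) for m in PHONE.finditer(text)]

print(find_numbers("tel.: +45 09827374, +45 89895867, some kind of text..."))
# ['tel.: +45 09827374', '+45 89895867']
print(find_numbers("tlf: +4567890202Girrafe"))
# ['tlf: +4567890202']
```

The label is only attached to the first number on a line, because the optional label group has to sit directly in front of the number it belongs to.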


Question 3

You don’t have to use a single regexp. You could match the tel: etc. label first and then match every phone number on the line, e.g. using GNU awk and POSIX EREs instead of PCREs:

$ awk -v FPAT='[+]45[[:space:]]*[0-9][0-9[:space:]]+[0-9]' '
    match($0,/^(tel|mobil|tlf)\.?:/,a) {
        printf "%s ", a[0]
        for (i=1; i<=NF; i++) {
            print $i
        }
    }
' file
tel.: +45 09827374
+45 89895867
mobil: +45 20802020
+45 20802001
tlf.: +45 5555 1212
tlf: +4567890202

You can do the same with any awk with just a bit more code:

$ awk 'match($0,/^(tel|mobil|tlf)\.?:/) {
    printf "%s ", substr($0,1,RLENGTH)
    while ( match($0,/[+]45[ \t]*[0-9][0-9 \t]+[0-9]/) ) {
        print substr($0,RSTART,RLENGTH)
        $0 = substr($0,RSTART+RLENGTH)
    }
}' file
tel.: +45 09827374
+45 89895867
mobil: +45 20802020
+45 20802001
tlf.: +45 5555 1212
tlf: +4567890202

and I’m sure you could do the same in Python, Perl, Ruby or whatever similar tool you like.

IMO it’s better to have a couple of small, simple regexps in your code than one long, complicated one.
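
For what it's worth, the same two-step idea might look roughly like this in Python. It is a sketch, not a drop-in replacement for the awk scripts: LABEL, NUMBER and extract are names invented here, and, like the awk versions, it assumes every number starts with +45 and ignores lines that do not begin with a label followed by a colon.

```python
import re

# Step 1: a label at the start of the line, e.g. "tel.:", "mobil:" or "tlf:".
LABEL = re.compile(r"^(tel|mobil|tlf)\.?:")

# Step 2: +45, optional whitespace, then digits and whitespace ending in a digit.
NUMBER = re.compile(r"[+]45\s*[0-9][0-9\s]+[0-9]")

def extract(line):
    """Return (label, [numbers]) for a labelled line, or None if there is no label."""
    label = LABEL.match(line)
    if label is None:
        return None
    return label.group(0), NUMBER.findall(line)

print(extract("tel.: +45 09827374, +45 89895867, some kind of text... "))
# ('tel.:', ['+45 09827374', '+45 89895867'])
print(extract("tlf: +4567890202Girrafe"))
# ('tlf:', ['+4567890202'])
```

Each regexp stays short enough to read at a glance, which is the point of splitting the job in two.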
