Regex for phone numbers

I want to extract phone numbers from text using a regex.

Examples:

I have this regex which finds the phone number very well:
^((\(?\+45\)?)?)(\s?\d{2}\s?\d{2}\s?\d{2}\s?\d{2})$

and it catches all the numbers below well.

But I cannot catch the “tel.”, “tlf”, “mobil:”, etc. that may come before the number. Also, if a letter comes directly after the last digit, the number is no longer matched, but it should be.

These examples are not covered:

tel.: +45 09827374, +45 89895867, some kind of text... 
mobil: +45 20802020, +45 20802001,
tlf.: +45 5555 1212 
tlf: +4567890202Girrafe

If helpful, I found this regex:
'\btlf\b\D*([\d\s]+\d') which extracts the number together with the tlf prefix and stops before the first letter that follows the digits.

So I tried to combine them and I obtained this but it doesn’t work:
\b(tlf|mobil|telephone|mobile|tel)\b\D*(^((\(?\+45\)?)?)(\s?\d{2}\s?\d{2}\s?\d{2}\s?\d{2})$)

Expected output:

  • for input: "tel.: +45 09827374, +45 89895867, some kind of text..." –> output: "tel.: +45 09827374" and "+45 89895867"
  • for input: "mobil: +45 20802020, +45 20802001," –> output: "mobil: +45 20802020" and "+45 20802001" or "mobil: +45 20802020, +45 20802001" is ok too
  • for input: "tlf +45 5555 1212" –> output: "tlf +45 5555 1212"
  • for input: "tlf: +4567890202Girrafe" –> output: "tlf: +4567890202"
  • for input: "+4567890202" –> output: "+4567890202"

Can you help me please?

  • Please edit your question to add the expected outcomes for examples.

  • Try (?:\b(tlf|mobil|telephone|mobile|tel)\b)?[^\w\n]*((?:\(\+45\)|\+45)?\s*\d{2}\s?\d{2}\s?\d{2}\s?\d{2})(?!\d) regex101.com/r/XiYYkZ/1

  • @PM77-1 I edited the question.

  • @AriadneR. If you don’t want that, you can be more specific about what can come after the tel., like this: regex101.com/r/mW6rPV/1 Or as a single match without groups: regex101.com/r/Ep0nWs/1

  • works like a charm! thank you so much!

If you want the full match only:

(?:\b(?:tlf|mobile?|tel(?:ephone)?)[.:\s]+)?(?:\(\+45\)|\+45)?\s*\d{2}(?:\s?\d{2}){3}(?!\d)

The pattern matches:

  • (?: Non-capture group
    • \b A word boundary to prevent a partial word match
    • (?:tlf|mobile?|tel(?:ephone)?) match one of the alternatives
    • [.:\s]+ match 1+ occurrences of either . : or a whitespace char
  • )? Close the non-capture group and make it optional
  • (?:\(\+45\)|\+45)? Optionally match either +45 or (+45)
  • \s*\d{2}(?:\s?\d{2}){3} Match optional whitespace chars and 2 digits, then 3 more times 2 digits, each optionally preceded by a whitespace char (8 digits in total)
  • (?!\d) Negative lookahead, assert not a digit directly to the right

See a regex demo
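As a rough sketch of how the pattern could be applied from code, here it is in Python's re module, run against the inputs from the question. The names PHONE_RE and find_phone_numbers are made up for this illustration, and the .strip() call is an addition: when the +45 part is absent, the \s* in the pattern can pull a leading space into the match.

```python
import re

# The pattern from this answer, split over two lines for readability.
# PHONE_RE and find_phone_numbers are hypothetical names for this sketch.
PHONE_RE = re.compile(
    r"(?:\b(?:tlf|mobile?|tel(?:ephone)?)[.:\s]+)?"
    r"(?:\(\+45\)|\+45)?\s*\d{2}(?:\s?\d{2}){3}(?!\d)"
)


def find_phone_numbers(text):
    """Return every full match, with any leading whitespace trimmed."""
    return [m.group(0).strip() for m in PHONE_RE.finditer(text)]


if __name__ == "__main__":
    samples = [
        "tel.: +45 09827374, +45 89895867, some kind of text...",
        "mobil: +45 20802020, +45 20802001,",
        "tlf +45 5555 1212",
        "tlf: +4567890202Girrafe",
        "+4567890202",
    ]
    for sample in samples:
        print(find_phone_numbers(sample))
```

Each label is attached only to the number that directly follows it, so a line with two numbers yields two matches, the second one without a label.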

You don’t have to use a single regexp; you could match the tel: etc. text first and then just match every phone number, e.g. using GNU awk and POSIX EREs instead of PCREs:

$ awk -v FPAT='[+]45[[:space:]]*[0-9][0-9[:space:]]+[0-9]' '
    match($0,/^(tel|mobil|tlf)\.?:/,a) {
        printf "%s ", a[0]
        for (i=1; i<=NF; i++) {
            print $i
        }
    }
' file
tel.: +45 09827374
+45 89895867
mobil: +45 20802020
+45 20802001
tlf.: +45 5555 1212
tlf: +4567890202

You can do the same with any awk with just a bit more code:

$ awk 'match($0,/^(tel|mobil|tlf)\.?:/) {
    printf "%s ", substr($0,1,RLENGTH)
    while ( match($0,/[+]45[ \t]*[0-9][0-9 \t]+[0-9]/) ) {
        print substr($0,RSTART,RLENGTH)
        $0 = substr($0,RSTART+RLENGTH)
    }
}' file
tel.: +45 09827374
+45 89895867
mobil: +45 20802020
+45 20802001
tlf.: +45 5555 1212
tlf: +4567890202

and I’m sure you could do the same in python, perl, ruby or whatever similar tool you like.
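For example, a minimal Python sketch of the same two-step idea might look like the following. It reuses the two small patterns from the awk scripts; the names PREFIX_RE, NUMBER_RE and extract_numbers are invented for this illustration. Like the awk versions, it only handles lines that start with a label, so a bare "+4567890202" line is skipped, and unlike awk it returns nothing at all when a labelled line has no number.

```python
import re

# The same two small patterns used in the awk scripts above.
# PREFIX_RE, NUMBER_RE and extract_numbers are hypothetical names.
PREFIX_RE = re.compile(r"^(?:tel|mobil|tlf)\.?:")
NUMBER_RE = re.compile(r"\+45[ \t]*[0-9][0-9 \t]+[0-9]")


def extract_numbers(line):
    """Return the numbers on a labelled line; the first one keeps its label."""
    prefix = PREFIX_RE.match(line)
    if prefix is None:
        return []  # like the awk scripts, ignore lines without a label
    numbers = NUMBER_RE.findall(line)
    if numbers:
        numbers[0] = f"{prefix.group(0)} {numbers[0]}"
    return numbers


if __name__ == "__main__":
    for line in [
        "tel.: +45 09827374, +45 89895867, some kind of text... ",
        "mobil: +45 20802020, +45 20802001,",
        "tlf.: +45 5555 1212 ",
        "tlf: +4567890202Girrafe",
    ]:
        for number in extract_numbers(line):
            print(number)
```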

IMO it’s better to have a couple of small, simple regexps in your code than one long, complicated one.
