Phone Number Regular Expression (Regex) in Python

Submitted by 霸气de小男生 on 2019-12-02 05:31:30

Question


Dive Into Python 3 has an amazing little tutorial on creating a regular expression for phone numbers: http://diveintopython3.ep.io/regular-expressions.html#phonenumbers

The final version comes out to look like:

phone_re = re.compile(r'(\d{3})\D*(\d{3})\D*(\d{4})\D*(\d*)$', re.VERBOSE)

This works fine for almost all the examples I can come up with; however, I found a pretty big failure that I can't seem to fix.

If a group of 3 digits comes before the phone number, it works fine. For example: "500 dollars off, call 123-456-7891"

If a group of 3 digits comes after the phone number, it fails. For example: "Call 123-456-7891 for a discount of up to 500"

Any ideas on a fix that would work for both examples?


Answer 1:


The (\d*)$ requires the match to run to the end of the string: after the final \D*, nothing but digits (possibly none) may remain (the $ signifies "end of line"). Try removing the $ if you're matching against a larger string where the phone number may not be at the end of the line.
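A minimal sketch of the difference the $ makes, using the pattern from the question. The sample string and the variable names are hypothetical, chosen so that digits follow the phone number and more text follows those digits:

```python
import re

# Pattern from the question, with and without the trailing $ anchor.
anchored = re.compile(r'(\d{3})\D*(\d{3})\D*(\d{4})\D*(\d*)$')
unanchored = re.compile(r'(\d{3})\D*(\d{3})\D*(\d{4})\D*(\d*)')

# Hypothetical sample: digits follow the phone number, then more text.
text = "Call 123-456-7891 for 500 dollars off"

anchored_match = anchored.search(text)      # no match: text follows "500"
unanchored_match = unanchored.search(text)  # matches once the $ is gone

print(anchored_match)
print(unanchored_match.groups())
```

In this sketch the unanchored pattern still captures the trailing 500 as the fourth group, because \D* skips over the intervening words. Removing the $ on its own may therefore not be enough if stray digits can follow the number.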




Answer 2:


Here's your original, with some spaces (use re.VERBOSE, or remove the spaces):

(\d{3}) \D* (\d{3}) \D* (\d{4}) \D* (\d*)

The \D* will match anything that's not a digit, including words. Maybe you should try this:

(\d{3}) \W* (\d{3}) \W* (\d{4}) \W* (\d*)

The \W* matches any run of non-word characters, that is, anything other than letters, digits, and underscores. It will match (222) - 222 - 2222. However, it will not match if there is a letter between the numbers, as in (222) x 222 - 2222. The last part of the match, (\d*), appears to be looking for an extension. Extensions can be formatted in a variety of ways, so I suggest you either drop it or refine it based on how you expect your data to look. And, as Amber says, you should probably drop the $.
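The \W* suggestion can be sketched roughly as follows. This version assumes the extension group and the $ are both dropped, which is only one of the options mentioned above. The name phone_re comes from the question, the helper find_phone is hypothetical, and the sample strings are the ones quoted in the question and in this answer:

```python
import re

# \W* allows only punctuation and whitespace between the digit groups.
phone_re = re.compile(r'''
    (\d{3}) \W*   # area code, then optional separators
    (\d{3}) \W*   # prefix, then optional separators
    (\d{4})       # line number
''', re.VERBOSE)

def find_phone(text):
    """Return the three digit groups of the first match, or None."""
    match = phone_re.search(text)
    return match.groups() if match else None

before = find_phone("500 dollars off, call 123-456-7891")
after = find_phone("Call 123-456-7891 for a discount of up to 500")
spaced = find_phone("(222) - 222 - 2222")
with_letter = find_phone("(222) x 222 - 2222")

print(before, after, spaced, with_letter)
```

Both sentences from the question yield the groups 123, 456 and 7891 here, and the string with a letter between the groups is rejected. The sketch does not guard against digits that are part of a longer number, so it may need word boundaries or similar refinements for real data.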



Source: https://stackoverflow.com/questions/3484721/phone-number-regular-expression-regex-in-python
