find email using regular expression python [duplicate]

Submitted by Deadly on 2020-08-06 07:11:55

Question


I want to find valid email addresses in a text file, and this is my code:

email = re.findall(r'[a-zA-Z\.-]+@[\w\.-]+',line)

But my code does not match email addresses that have digits before the @ sign, and it cannot reject addresses that lack a valid ending. Could anyone help me with these two problems? Thank you!

An example of my problem would be:

my code can find this email: xyz@gmail.com

but it cannot find this one: xyz123@gmail.com

And it cannot filter this email out either: xyz@gmail


Answer 1:


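From the Python re docs, \w matches any alphanumeric character or underscore. With the re.ASCII flag it is equivalent to the set [a-zA-Z0-9_]; for Unicode str patterns it also matches word characters from other scripts. So [\w\.-] will match digits as well as letters. The trailing group is written as non-capturing, (?:...), because re.findall returns only the group's contents when the pattern contains a capturing group.

email = re.findall(r'[\w\.-]+@[\w\.-]+(?:\.[\w]+)+',line)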

This post discusses matching email addresses much more extensively, and there are a few more pitfalls that this pattern does not catch. For example, email addresses cannot be made up entirely of punctuation (...@....). Additionally, there is often a maximum length on addresses, depending on the email server. Also, many email servers accept non-English characters. So depending on your needs you may need a more comprehensive pattern.
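
As a rough illustration of the point about \w and the required dotted ending, here is a minimal sketch run against the three addresses from the question. The names CAPTURING, NON_CAPTURING, find_emails and sample are made up for this example. It also shows why the group matters with re.findall: a capturing group makes findall return only the captured suffix, while a non-capturing group returns the whole address.

```python
import re

# Capturing group: findall returns only what the group captured.
CAPTURING = r'[\w.-]+@[\w.-]+(\.\w+)+'
# Non-capturing group (?:...): findall returns the whole match.
NON_CAPTURING = r'[\w.-]+@[\w.-]+(?:\.\w+)+'


def find_emails(text):
    """Return every substring of text that looks like an email address.

    Hypothetical helper for illustration; it is a loose match, not a
    full validator.
    """
    return re.findall(NON_CAPTURING, text)


sample = "found: xyz@gmail.com, xyz123@gmail.com; rejected: xyz@gmail"

print(re.findall(CAPTURING, sample))  # → ['.com', '.com']
print(find_emails(sample))            # → ['xyz@gmail.com', 'xyz123@gmail.com']
```

xyz@gmail is skipped because the pattern requires at least one dot followed by word characters after the domain name.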




Answer 2:


Try the validate_email package.

pip install validate_email

Then

from validate_email import validate_email
is_valid = validate_email('example@example.com')



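Answer 3:


^[\w.-]+@([\w-]+\.)+[\w-]{2,4}$

Not mine, but I have used it in apps before. Note that the ^ and $ anchors make it suited to validating a whole string rather than searching inside a line, and that {2,4} rejects top-level domains longer than four characters.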

Source
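
A small sketch of how an anchored pattern like this could be used in Python to validate one candidate string at a time. Python's re module rejects the class [\w-\.] as a bad character range, so the sketch writes it as [\w.-]. The names ANCHORED_RE and is_valid_email are hypothetical, chosen only for this example.

```python
import re

# Anchored pattern: must match the entire candidate string.
# The character class is written [\w.-] because Python raises
# "bad character range" for [\w-\.].
ANCHORED_RE = re.compile(r'^[\w.-]+@([\w-]+\.)+[\w-]{2,4}$')


def is_valid_email(candidate):
    """Return True if the whole candidate string matches the pattern."""
    return ANCHORED_RE.match(candidate) is not None


for candidate in ("xyz123@gmail.com", "xyz@gmail", "user@example.museum"):
    print(candidate, is_valid_email(candidate))
```

The last candidate is rejected only because of the {2,4} limit on the final label, which is worth relaxing if longer top-level domains need to pass.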



Source: https://stackoverflow.com/questions/41798539/find-email-using-regular-expression-python
