Question
I am very new to regex. Using Python's re module, I want to extract phone numbers from the following multi-line string:
Source = """<p><strong>Kuala Lumpur</strong><strong>:</strong> +60 (0)3 2723 7900</p>
<p><strong>Mutiara Damansara:</strong> +60 (0)3 2723 7900</p>
<p><strong>Penang:</strong> + 60 (0)4 255 9000</p>
<h2>Where we are </h2>
<strong> Call us on:</strong> +6 (03) 8924 8686
</p></div><div class="sys_two">
<h3 class="parentSchool">General enquiries</h3><p style="FONT-SIZE: 11px">
<strong> Call us on:</strong> +6 (03) 8924 8000
+ 60 (7) 268-6200 <br />
Fax:<br />
+60 (7) 228-6202<br />
Phone:</strong><strong style="color: #f00">+601-4228-8055</strong>"""
So when I compile the pattern, I should be able to find them using
phone = re.findall(pattern, Source, re.DOTALL)
['+60 (0)3 2723 7900',
'+60 (0)3 2723 7900',
'+ 60 (0)4 255 9000',
'+6 (03) 8924 8686',
'+6 (03) 8924 8000',
'+ 60 (7) 268-6200',
'+60 (7) 228-6202',
'+601-4228-8055']
Please help me identify the right pattern.
Answer 1:
Using the re module:
>>> import re
>>> Source = """<p><strong>Kuala Lumpur</strong><strong>:</strong> +60 (0)3 2723 7900</p>
<p><strong>Mutiara Damansara:</strong> +60 (0)3 2723 7900</p>
<p><strong>Penang:</strong> + 60 (0)4 255 9000</p>
<h2>Where we are </h2>
<strong> Call us on:</strong> +6 (03) 8924 8686
</p></div><div class="sys_two">
<h3 class="parentSchool">General enquiries</h3><p style="FONT-SIZE: 11px">
<strong> Call us on:</strong> +6 (03) 8924 8000
+ 60 (7) 268-6200 <br />
Fax:<br />
+60 (7) 228-6202<br />
Phone:</strong><strong style="color: #f00">+601-4228-8055</strong>"""
>>> for i in re.findall(r'\+[-()\s\d]+?(?=\s*[+<])', Source):
...     print(i)
...
+60 (0)3 2723 7900
+60 (0)3 2723 7900
+ 60 (0)4 255 9000
+6 (03) 8924 8686
+6 (03) 8924 8000
+ 60 (7) 268-6200
+60 (7) 228-6202
+601-4228-8055
>>>
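The answer gives the pattern without explaining it, so here is a commented reading of it written with re.VERBOSE. The name PHONE_RE and the shortened sample string are assumptions made for illustration. The pattern itself is unchanged.

```python
import re

# Same pattern as in the session above, spelled out piece by piece.
# PHONE_RE is a hypothetical name used only in this sketch.
PHONE_RE = re.compile(r"""
    \+              # every number starts with a literal plus sign
    [-()\s\d]+?     # digits, whitespace, hyphens, parentheses (lazy)
    (?=\s*[+<])     # stop before optional whitespace and then '+' or '<'
""", re.VERBOSE)

# A shortened sample in the same shape as Source.
sample = "<p>Penang:</strong> + 60 (0)4 255 9000</p>\nFax:<br />\n+60 (7) 228-6202<br />"

print(PHONE_RE.findall(sample))
# ['+ 60 (0)4 255 9000', '+60 (7) 228-6202']
```

Because of the lookahead, a number is only found when a tag or another + follows it. A number sitting at the very end of the string with nothing after it would be missed.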
Answer 2:
This should find all the phone numbers in a given string:
re.findall(r'\+?\(?[1-9][0-9 .\-\(\)]{8,}[0-9]', Source)
>>> re.findall(r'[\+\(]?[1-9][0-9 .\-\(\)]{8,}[0-9]', Source)
['+60 (0)3 2723 7900', '+60 (0)3 2723 7900', '60 (0)4 255 9000', '+6 (03) 8924 8686', '+6 (03) 8924 8000', '60 (7) 268-6200', '+60 (7) 228-6202', '+601-4228-8055']
Basically, the regex lays out these rules:
- The matched string may start with a + or ( symbol.
- That must be followed by a digit from 1 to 9.
- The match must end with a digit from 0 to 9.
- In between, it must contain at least eight characters, each of which is a digit, a space, or one of . - ( ).
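In the output above, the two numbers written as "+ 60 ..." lose their leading plus sign, because the rules require the first digit to follow the + directly. One possible adjustment, shown here as a sketch and not as part of the original answer, is to allow a single optional whitespace character after the +. The name LOOSE_PHONE_RE and the shortened sample are assumptions.

```python
import re

# Hypothetical variant of the answer's pattern: an optional '+' that may be
# followed by one whitespace character, then the same rules as before.
LOOSE_PHONE_RE = re.compile(r'(?:\+\s?)?\(?[1-9][0-9 .\-()]{8,}[0-9]')

# A shortened sample in the same shape as Source.
sample = """<p><strong>Penang:</strong> + 60 (0)4 255 9000</p>
<strong> Call us on:</strong> +6 (03) 8924 8000
+ 60 (7) 268-6200 <br />
<strong style="color: #f00">+601-4228-8055</strong>"""

for number in LOOSE_PHONE_RE.findall(sample):
    print(number)
# + 60 (0)4 255 9000
# +6 (03) 8924 8000
# + 60 (7) 268-6200
# +601-4228-8055
```

The group is non-capturing, so re.findall still returns the whole match and not just the prefix.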
Answer 3:
I extract the mobile number from a string using the regular expression below.
import re

sent = "this is my mobile number 9999922118"
# Match a standalone 10-digit number whose first digit is 7, 8, or 9.
phone = re.search(r'\b[789]\d{9}\b', sent)
if phone:
    print(phone.group(0))  # 9999922118
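re.search returns only the first match. To collect every such number in a string, the same pattern can be used with re.findall. This is a minimal sketch: the name MOBILE_RE and the sample text are assumptions.

```python
import re

# Same pattern as above: a standalone 10-digit number starting with 7, 8, or 9.
# MOBILE_RE is a hypothetical name used only in this sketch.
MOBILE_RE = re.compile(r'\b[789]\d{9}\b')

text = "call 9999922118 or 8888812345, not 12345678901234 or 5999922118"

print(MOBILE_RE.findall(text))
# ['9999922118', '8888812345']
```

The \b anchors keep the pattern from matching a 10-digit slice inside a longer run of digits. Note that this pattern does not handle the +60-style numbers in the question, which contain spaces, parentheses, and hyphens.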
Source: https://stackoverflow.com/questions/37393480/python-regex-to-extract-phone-numbers-from-string