Unable to extract date of birth from a given format

Posted by 孤街浪徒 on 2019-12-08 23:16:35
import re    

data="""
Thomas, John - DOB/Sex:    12/23/1955                                     11/15/2014   11:53 AM"
Jacob's Date of birth is 9/15/1963
Name:Annie; DOB:10/30/1970
"""

pattern = re.compile(r'.*?\b(?:DOB|Date of birth)\b.*?(\d{1,2}[/-]\d{1,2}[/-](?:\d\d){1,2})',re.I)

matches=pattern.findall(data)

for match in matches:
    print(match)    

Output:

12/23/1955
9/15/1963
10/30/1970

Explanation:

.*?             : 0 or more of any character except newline (non-greedy)
\b              : word boundary
(?:             : start non capture group
  DOB           : literally
 |              : OR
  Date of birth : literally
)               : end group
\b              : word boundary
.*?             : 0 or more of any character except newline (non-greedy)
(               : start group 1
    \d{1,2}     : 1 or 2 digits
    [/-]        : slash or dash
    \d{1,2}     : 1 or 2 digits
    [/-]        : slash or dash
    (?:         : start non capture group
        \d\d    : 2 digits
    ){1,2}      : end group; may appear once or twice (i.e. 2 or 4 digits)
)               : end capture group 1
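
The breakdown above can also live inside the pattern itself. The following is a minimal sketch, assuming Python's re.VERBOSE flag is acceptable in your code base; the name DOB_PATTERN is only illustrative. In verbose mode literal spaces are ignored, so the spaces in "Date of birth" are written as \s+, and the leading .*? is dropped because findall scans forward on its own.

```python
import re

# Same idea as the pattern explained above, written with inline comments.
DOB_PATTERN = re.compile(
    r"""
    \b(?:DOB|Date\s+of\s+birth)\b   # label: DOB or Date of birth
    .*?                             # anything up to the first date on the line
    (                               # group 1: the date
        \d{1,2}[/-]\d{1,2}[/-]      # month and day, each followed by / or -
        (?:\d\d){1,2}               # 2- or 4-digit year
    )
    """,
    re.IGNORECASE | re.VERBOSE,
)

data = """
Thomas, John - DOB/Sex:    12/23/1955                                     11/15/2014   11:53 AM
Jacob's Date of birth is 9/15/1963
Name:Annie; DOB:10/30/1970
"""

dates = DOB_PATTERN.findall(data)
print(dates)
```

Because . does not match a newline, each match stays on the line that carries the label, so the second date on the first line (11/15/2014) is not picked up.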
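
A simpler alternative, if the label is always DOB followed by a colon and at least one whitespace character (it will not match "DOB:10/30/1970", which has no space after the colon):

import re
string = "DOB/Sex:    12/23/1955            11/15/2014   11:53 AM"
print(re.findall(r'.*?DOB.*?:\s+([\d/]+)', string))

Output: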

['12/23/1955']
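
If the extracted text needs to become a real date rather than a string, something along these lines may help. This is only a sketch: parse_dob is a hypothetical helper, not part of either answer, and it assumes the dates are in month/day/year order as in the sample data.

```python
import re
from datetime import date, datetime

# Hypothetical helper: find the first DOB in a line and convert it to a date.
DOB_RE = re.compile(
    r'\b(?:DOB|Date of birth)\b.*?(\d{1,2}[/-]\d{1,2}[/-](?:\d\d){1,2})',
    re.I,
)

def parse_dob(text):
    """Return the first date of birth found in text as a date, or None."""
    m = DOB_RE.search(text)
    if m is None:
        return None
    raw = m.group(1).replace('-', '/')   # normalise dashes to slashes
    # Assumption: month/day/year order; try 4-digit years, then 2-digit years.
    for fmt in ('%m/%d/%Y', '%m/%d/%y'):
        try:
            return datetime.strptime(raw, fmt).date()
        except ValueError:
            continue
    return None                          # looked like a date but was not valid

print(parse_dob("Thomas, John - DOB/Sex:    12/23/1955      11/15/2014   11:53 AM"))
print(parse_dob("Jacob's Date of birth is 9/15/1963"))
```

Note that strptime's %y maps two-digit years 00 to 68 onto 2000 to 2068, which is rarely what you want for a date of birth, so two-digit years may need their own handling.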