I have a dataset of administrative filings that include short biographies. I am trying to extract people\'s ages by using python and some pattern matching. Some example of s
a simple way to find the age of a person from your sentences will be to extract a number with 2 digits:
import re
sentence = 'Steve Jones, Age: 32,'
print(re.findall(r"\b\d{2}\b", 'Steve Jones, Age: 32,')[0])
# output: 32
if you do not want %
to be at the end of your number and also you want to have a white space in the begening you could do:
sentence = 'Equity awards granted to Mr. Love in 2010 represented 48% of his total compensation'
match = re.findall(r"\b\d{2}(?!%)[^\d]", sentence)
if match:
print(re.findall(r"\b\d{2}(?!%)[^\d]", sentence)[0][:2])
else:
print('no match')
# output: no match
works well also for the previous sentence