I want to extract a number from an HTML string (I usually do not know the number in advance).
The crucial part looks like this:
If the string "TOTAL : number" is unique, use one regular expression to find that substring and a second one to extract the number from it.
import re

string = 'test test="3" test="search_summary_figure WHR WVM">TOTAL : 286'
reg__expr = r'TOTAL\s:\s\d+'  # "TOTAL", one whitespace, ":", one whitespace, then digits

# find the substring
result = re.findall(reg__expr, string)
if result:
    substring = result[0]
    reg__expr = r'\d+'  # one or more digits
    result = re.findall(reg__expr, substring)
    number = int(result[0])
    print(number)
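The two searches can probably be collapsed into one by using a capture group. The following is only a sketch: the helper name `extract_total` is made up for illustration, and the pattern uses `\s*` instead of `\s` on the assumption that the spacing around the colon may vary.

```python
import re

def extract_total(html):
    """Return the integer that follows 'TOTAL :' in html, or None if it is absent."""
    # The parentheses capture only the digits, so no second search is needed.
    match = re.search(r'TOTAL\s*:\s*(\d+)', html)
    return int(match.group(1)) if match else None

string = 'test test="3" test="search_summary_figure WHR WVM">TOTAL : 286'
print(extract_total(string))  # 286
```

Returning `None` when nothing matches lets the caller decide how to handle a page without a total.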
You can test your own regular expressions at https://regex101.com/