Extract string from HTML String

名媛妹妹 2021-01-26 04:26

I want to extract a number from an HTML string (I usually do not know the number in advance).

The crucial part looks like this:



        
5 answers
  •  醉酒成梦
    2021-01-26 05:17

    If the string "TOTAL : number" is unique, use a regular expression to find this substring first and then extract the number from it.

    import re

    string = 'test test="3" test="search_summary_figure WHR WVM">TOTAL : 286'

    # Step 1: find the unique "TOTAL : <digits>" substring.
    reg__expr = r'TOTAL\s:\s\d+'
    result = re.findall(reg__expr, string)
    if result:
        substring = result[0]

        # Step 2: pull the digits out of that substring.
        reg__expr = r'\d+'
        result = re.findall(reg__expr, substring)
        number = int(result[0])

        print(number)  # 286
    

    You can test your own regular expressions at https://regex101.com/
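
The two-step search above could also be collapsed into a single pattern with a capture group. This is only a sketch: the helper name `extract_total` is hypothetical, and it assumes the page always uses the "TOTAL : digits" layout from the sample string, with optional whitespace around the colon.

```python
import re

# Assumption: the number always follows the literal text "TOTAL", a colon,
# and optional whitespace, as in the sample string from the answer above.
TOTAL_PATTERN = re.compile(r'TOTAL\s*:\s*(\d+)')


def extract_total(html):
    """Return the number after 'TOTAL :' as an int, or None if it is absent."""
    match = TOTAL_PATTERN.search(html)
    # group(1) holds only the digits matched inside the parentheses
    return int(match.group(1)) if match else None


sample = 'test test="3" test="search_summary_figure WHR WVM">TOTAL : 286'
print(extract_total(sample))  # 286
```

Returning None when nothing matches lets the caller decide how to handle a page that has no total, instead of failing with an IndexError.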
