Python Regular Expression example

天命终不由人 2020-12-08 10:04

I want to write a simple regular expression in Python that extracts a number from HTML. The HTML sample is as follows:

Your number is <b>123</b>

9 answers
  • 2020-12-08 10:31
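    import re

    val = "Your number is <b>123</b>"
    

    Option 1: capture the digits between two tags with re.search

    m = re.search(r'(<.*?>)(\d+)(<.*?>)', val)
    
    m.group(2)  # '123'
    

    Option 2: replace the whole string with the captured digits using re.sub

    re.sub(r'([\s\S]+)(<.*?>)(\d+)(<.*?>)', r'\3', val)  # '123'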
    
  • 2020-12-08 10:35

    Given s = "Your number is <b>123</b>" then:

     import re 
     m = re.search(r"\d+", s)
    

    will work and give you

     m.group()
    '123'
    

    The regular expression looks for 1 or more consecutive digits in your string.

    Note that in this specific case we knew that there would be a numeric sequence. Otherwise you would have to test the return value of re.search() to make sure that m is not None, because calling m.group() on None raises an AttributeError.
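
As a minimal sketch of that check, the search can be wrapped in a small function that returns None when no digits are found. The function name extract_number is a hypothetical helper for illustration, not part of the re module.

```python
import re


def extract_number(text):
    """Return the first run of digits in text, or None if there is none."""
    m = re.search(r"\d+", text)
    if m is None:
        # No digits found; calling m.group() here would raise AttributeError.
        return None
    return m.group()


print(extract_number("Your number is <b>123</b>"))  # 123
print(extract_number("No digits here"))             # None
```

The result is still a string, so convert it with int() if you need a numeric value.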

    Of course if you are going to process a lot of HTML you want to take a serious look at BeautifulSoup - it's meant for that and much more. The whole idea with BeautifulSoup is to avoid "manual" parsing using string ops or regular expressions.
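
To illustrate parsing instead of pattern matching, here is a rough sketch that uses the standard library's html.parser rather than BeautifulSoup (which is a third-party package and would do this in fewer lines). The class BoldTextCollector and the function extract_bold_text are hypothetical names, and the sketch assumes the number always sits inside a <b> element.

```python
from html.parser import HTMLParser


class BoldTextCollector(HTMLParser):
    """Collects the text found inside <b> elements (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self._depth = 0    # how many <b> tags are currently open
        self.chunks = []   # text pieces seen inside <b>...</b>

    def handle_starttag(self, tag, attrs):
        if tag == "b":
            self._depth += 1

    def handle_endtag(self, tag):
        if tag == "b" and self._depth > 0:
            self._depth -= 1

    def handle_data(self, data):
        # Keep only text that appears while a <b> element is open.
        if self._depth > 0:
            self.chunks.append(data)


def extract_bold_text(html):
    parser = BoldTextCollector()
    parser.feed(html)
    parser.close()
    return "".join(parser.chunks)


print(extract_bold_text("Your number is <b>123</b>"))  # 123
```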

  • 2020-12-08 10:35

    The simplest way is to just extract the digits. Note that re.search returns a match object, so call group() to get the string:

    re.search(r"\d+", text).group()
    