Python regular expression for HTML parsing (BeautifulSoup)

感情败类 2020-11-27 19:21

I want to grab the value of a hidden input field in HTML.


I

7 answers
  •  夕颜 (OP)
    2020-11-27 19:40

    For this particular case, BeautifulSoup takes more code than a regex, but it is much more robust. Since you already know which regexp to use, I'll just contribute the BeautifulSoup example :-)

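    # BeautifulSoup 4 (package bs4); the older `from BeautifulSoup import BeautifulSoup` is BS3, Python 2 only
    from bs4 import BeautifulSoup

    # Or retrieve it from the web, etc.
    with open('/yourwebsite/page.html', 'r') as f:
        html_data = f.read()

    # Create the soup object from the HTML data
    soup = BeautifulSoup(html_data, 'html.parser')

    # Find the proper tag. The HTML attribute `name` has to go through attrs,
    # because find()'s first parameter is itself called `name`.
    fooId = soup.find('input', attrs={'name': 'fooId', 'type': 'hidden'})

    # Index the attribute by name; a positional lookup breaks if the attribute order changes
    value = fooId['value']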
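
To make the robustness point concrete, here is a minimal sketch comparing the two approaches on an inline snippet. The HTML, the helper names `hidden_value_bs4` and `hidden_value_regex`, and the regex itself are all made up for illustration (the regex the asker actually uses is not shown in the thread); it assumes BeautifulSoup 4 is installed as `bs4`.

```python
import re

from bs4 import BeautifulSoup

# Hypothetical page fragment; note the attribute order on the hidden input.
HTML = '''
<form action="/submit" method="post">
  <input type="text" name="user" value="alice">
  <input value="abc123" type="hidden" name="fooId">
</form>
'''


def hidden_value_bs4(html, field_name):
    """Return the value of the hidden input called field_name, or None."""
    soup = BeautifulSoup(html, 'html.parser')
    tag = soup.find('input', attrs={'name': field_name, 'type': 'hidden'})
    return tag.get('value') if tag is not None else None


def hidden_value_regex(html, field_name):
    """Naive regex version that assumes the order type, name, value."""
    pattern = (r'<input\s+type="hidden"\s+name="%s"\s+value="([^"]*)"'
               % re.escape(field_name))
    match = re.search(pattern, html)
    return match.group(1) if match else None


print(hidden_value_bs4(HTML, 'fooId'))    # found regardless of attribute order
print(hidden_value_regex(HTML, 'fooId'))  # misses: attributes are in another order
```

A regex can of course be written to tolerate reordered attributes, but each such case (extra whitespace, single quotes, additional attributes) has to be handled by hand, whereas the parser deals with them already.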
    
