Missing parts on Beautiful Soup results

别跟我提以往 2020-12-05 12:46

I am trying to retrieve a few <p> tags from the following HTML code. Only part of it is shown here:


    

        
1 Answer
  • 2020-12-05 13:04

    Beautiful Soup can use different parsers to handle HTML input. The HTML here is slightly malformed, and the default HTMLParser-based parser does not handle it well.

    Use the html5lib parser instead:

    >>> len(BeautifulSoup(r.text, 'html').find('td', attrs={'class': 'eelantext'}).find_all('p'))
    0
    >>> len(BeautifulSoup(r.text, 'lxml').find('td', attrs={'class': 'eelantext'}).find_all('p'))
    0
    >>> len(BeautifulSoup(r.text, 'html5lib').find('td', attrs={'class': 'eelantext'}).find_all('p'))
    22
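
The comparison above can be wrapped in a small helper that tries each parser in turn. This is only a sketch: `count_paragraphs` and `SAMPLE_HTML` are hypothetical names, the sample markup is a well-formed stand-in for `r.text` (the page from the question is not reproduced here), and `lxml` and `html5lib` are assumed to be optional installs, so a missing one is reported as `None` rather than raising.

```python
from bs4 import BeautifulSoup, FeatureNotFound

# Hypothetical stand-in for r.text; replace it with the real response body.
SAMPLE_HTML = """
<table><tr><td class="eelantext">
  <p>first</p>
  <p>second</p>
</td></tr></table>
"""


def count_paragraphs(html, parsers=("html.parser", "lxml", "html5lib")):
    """Map each parser name to the number of <p> tags found inside
    <td class="eelantext">, or to None if that parser is not installed."""
    counts = {}
    for name in parsers:
        try:
            soup = BeautifulSoup(html, name)
        except FeatureNotFound:
            # The requested parser library is not available.
            counts[name] = None
            continue
        cell = soup.find("td", attrs={"class": "eelantext"})
        # A parser that drops the cell entirely counts as zero paragraphs.
        counts[name] = len(cell.find_all("p")) if cell is not None else 0
    return counts


if __name__ == "__main__":
    for name, n in count_paragraphs(SAMPLE_HTML).items():
        print(name, "not installed" if n is None else n)
```

On well-formed markup like the sample, the installed parsers should agree. When the counts differ on a real page, that usually points to malformed HTML that each parser repairs in its own way, which is the situation described in this answer.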
    