Decode HTML entities in Python string?

名媛妹妹 2020-11-21 06:18

I'm parsing some HTML with Beautiful Soup 3, but it contains HTML entities which Beautiful Soup 3 doesn't automatically decode for me:

>>> from Be         


        
6 Answers
  •  梦毁少年i
    2020-11-21 06:52

    Beautiful Soup 4 lets you set a formatter for your output.

    If you pass in formatter=None, Beautiful Soup will not modify strings at all on output. This is the fastest option, but it may lead to Beautiful Soup generating invalid HTML/XML, as in these examples:

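    print(soup.prettify(formatter=None))
    # <html>
    #  <body>
    #   <p>
    #    Il a dit <<Sacré bleu!>>
    #   </p>
    #  </body>
    # </html>

    link_soup = BeautifulSoup('<a href="http://example.com/?foo=val1&bar=val2">A link</a>')
    print(link_soup.a.encode(formatter=None))
    # <a href="http://example.com/?foo=val1&bar=val2">A link</a>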
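
If the goal is only to turn entities in a plain Python string back into characters, the standard library may be enough without touching Beautiful Soup's output formatting. The following is a minimal sketch, assuming Python 3.4 or later; the helper name `decode_entities` and the sample string are made up for illustration and do not come from the question.

```python
import html


def decode_entities(text: str) -> str:
    """Return text with HTML entities replaced by the characters they name."""
    # html.unescape handles named (&eacute;), decimal (&#233;) and
    # hexadecimal (&#xe9;) character references in a single pass.
    return html.unescape(text)


# Hypothetical sample input, chosen for illustration only.
escaped = "Il a dit &lt;&lt;Sacr&eacute; bleu!&gt;&gt;"
decoded = decode_entities(escaped)
print(decoded)
```

Note that the decoded string is no longer safe to embed in markup as-is, which is the same caveat that applies to formatter=None: pass it through html.escape before writing it back into HTML.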
