Replace HTML entities with the corresponding UTF-8 characters in Python 2.6

长发绾君心 2020-12-15 23:04

I have HTML text like this:

<xml ... >

and I want to convert it to something readable:


         


        
3 Answers
  •  盖世英雄少女心
    2020-12-15 23:31

    Python 2.7

    Official documentation for HTMLParser: Python 2.7

    >>> import HTMLParser
    >>> pars = HTMLParser.HTMLParser()
    >>> pars.unescape('&copy; &euro;')
    u'\xa9 \u20ac'
    >>> print _
    © €
    

    Python 3

    Official documentation for HTMLParser: Python 3

    >>> from html.parser import HTMLParser
    >>> pars = HTMLParser()
    >>> pars.unescape('&copy; &euro;')
    '© €'
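
Note that `HTMLParser.unescape()` was deprecated in Python 3.4 and removed in Python 3.9; on current Python 3 the module-level `html.unescape()` function does the same job. Below is a minimal sketch of one way to pick whichever is available, assuming you want a single `unescape` name across versions. The `decode_entities` helper and the sample string are hypothetical, added only for illustration.

```python
# Pick an unescape function that exists on the running interpreter.
try:
    from html import unescape                    # Python 3.4+
except ImportError:
    try:
        from html.parser import HTMLParser       # Python 3.0-3.3
    except ImportError:
        from HTMLParser import HTMLParser        # Python 2.x
    unescape = HTMLParser().unescape


def decode_entities(text):
    """Hypothetical helper: decode named and numeric HTML entities."""
    return unescape(text)


# Named, decimal and hexadecimal references are all handled.
sample = "&copy; &euro; &#169; &#x20AC; &lt;b&gt;"
decoded = decode_entities(sample)
print(decoded)
```

On Python 2 the result is a `unicode` object, so encode it (for example with `.encode('utf-8')`) before writing it to a byte stream.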
    
