How can I decode HTML entities?

自闭症患者 2021-02-01 04:43

Here's a quick Perl question:

How can I convert HTML special characters like &uuml; or &#39; to normal ASCII text?

I started

5 answers
  •  我在风中等你
    2021-02-01 05:20

    I use this script. Save it as html2utf.py somewhere on your PATH, make it executable, and pipe HTML into it, e.g. echo "$some_html" | html2utf.py.

    #!/usr/bin/env python3
    """
    An alternative to `perl -Mopen=locale -MHTML::Entities -pe '$_ = decode_entities($_)'` (which needs the HTML::Entities module, installable with `cpanm HTML::Entities`) and to `recode html..`.
    """
    
    import fileinput
    import html
    
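    # Read lines from the files named on the command line, or from stdin if none are given.
    for line in fileinput.input():
        # Strip the trailing newline before decoding; print() adds it back.
        print(html.unescape(line.rstrip('\n')))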
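
To see what the script above does to individual entities, here is a minimal sketch built on the same `html.unescape` call. The function names `decode_entities` and `to_ascii` and the sample strings are made up for illustration and are not part of the original answer. Decoding yields Unicode text rather than ASCII, so the second helper shows one possible way to fold accented letters down to ASCII if that is really what is wanted; it is lossy and drops anything it cannot map.

```python
import html
import unicodedata


def decode_entities(text: str) -> str:
    """Turn named and numeric HTML entities into the characters they stand for."""
    return html.unescape(text)


def to_ascii(text: str) -> str:
    """Hypothetical helper: strip accents and drop anything still outside ASCII."""
    # NFKD splits an accented letter into a base letter plus a combining mark.
    decomposed = unicodedata.normalize("NFKD", text)
    # Encoding with "ignore" discards the combining marks and other non-ASCII characters.
    return decomposed.encode("ascii", "ignore").decode("ascii")


if __name__ == "__main__":
    # Illustrative inputs only: a named entity, decimal and hex numeric
    # entities, and a double-escaped entity (decoded one level per call).
    for sample in ["M&uuml;ller", "it&#39;s", "it&#x27;s", "&amp;lt;"]:
        decoded = decode_entities(sample)
        print(f"{sample!r} -> {decoded!r} -> {to_ascii(decoded)!r}")
```

Note that `html.unescape` removes only one level of escaping per call, so a double-escaped string such as `&amp;lt;` comes back as `&lt;` and needs a second pass to become `<`.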
    
