How can I decode HTML entities?

自闭症患者 2021-02-01 04:43

Here's a quick Perl question:

How can I convert HTML special characters like &uuml; (ü) or &#39; (') to normal ASCII text?

I started

5 answers
  •  天命终不由人
    2021-02-01 05:26

    Note that there are hex-specified characters too. They look like this: &#xE9; (é).

    Use decode_entities from HTML::Entities to translate the entities into actual characters. Converting those characters to ASCII takes more work. I've had some success in the past using iconv (Perl interface: Text::Iconv) with the transliterate option turned on. But if you are dealing with a limited set of entities, or you don't actually need the text reduced to ASCII equivalents, you may be better off limiting what decode_entities produces or providing it with custom conversion maps. See the HTML::Entities documentation.
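
The two steps described above (decode first, then reduce to ASCII) might look roughly like the sketch below. It is only an illustration under a few assumptions: the helper names html_to_text and to_ascii and the sample string are made up for this example, and the ASCII step uses the core Unicode::Normalize module to strip accents instead of the iconv transliteration mentioned above, so the result for characters without a simple base letter will differ from what iconv would give.

```perl
use strict;
use warnings;
use HTML::Entities qw(decode_entities);
use Unicode::Normalize qw(NFD);

# Hypothetical helper: decode named (&uuml;), decimal (&#39;) and
# hex (&#xE9;) entities into real Perl characters.
sub html_to_text {
    my ($html) = @_;
    return decode_entities($html);
}

# Hypothetical helper: a rough ASCII reduction. This is NOT iconv
# transliteration; it decomposes accented letters, drops the combining
# marks, and replaces anything still outside ASCII with '?'.
sub to_ascii {
    my ($text) = @_;
    my $decomposed = NFD($text);
    $decomposed =~ s/\p{Mn}//g;           # remove combining accents
    $decomposed =~ s/[^\x00-\x7F]/?/g;    # fallback for everything else
    return $decomposed;
}

# Invented sample input containing all three entity forms.
my $sample  = 'M&uuml;ller&#39;s caf&#xE9;';
my $decoded = html_to_text($sample);
my $ascii   = to_ascii($decoded);
print "$ascii\n";
```

If you only need the decoded characters and not plain ASCII, html_to_text alone is enough and the second step can be skipped.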
