decodeURIComponent vs unescape, what is wrong with unescape?

醉话见心 2020-11-29 23:32

In answering another question I became aware that my JavaScript/DOM knowledge had become a bit out of date, in that I am still using escape/unescape.

4 answers
  •  醉话见心
    2020-11-30 00:16

    escape percent-encodes, as %XX, only characters in the range 0 to 255 inclusive (ISO-8859-1, which is effectively the Unicode code points representable with a single byte). (*)

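    encodeURIComponent works for every well-formed string JavaScript can represent. That is the whole Unicode range, i.e. code points 0 to 1,114,111 (0x10FFFF), not just the basic multilingual plane (0 to 0xFFFF), so it covers almost any human writing system in current use. Characters above 0xFFFF are stored as surrogate pairs, and an unpaired surrogate makes encodeURIComponent throw a URIError.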
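To illustrate the range encodeURIComponent covers, here is a minimal sketch using a character outside the basic multilingual plane. The variable names are only illustrative, and the lone-surrogate case is included because it is the one input the function rejects.

```javascript
// U+1F600 lies above 0xFFFF, so JavaScript stores it as a surrogate pair.
const grinning = '\u{1F600}';
const codeUnits = grinning.length;                 // 2 UTF-16 code units

// encodeURIComponent encodes the whole code point as four UTF-8 bytes.
const encodedGrinning = encodeURIComponent(grinning); // '%F0%9F%98%80'

// A lone (unpaired) surrogate is not a well-formed string, so it throws.
let loneSurrogateError = null;
try {
  encodeURIComponent('\uD83D');
} catch (err) {
  loneSurrogateError = err;                        // a URIError
}

console.log(codeUnits, encodedGrinning, loneSurrogateError instanceof URIError);
```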

    Both functions produce URL-safe strings that use only code points 0 to 127 inclusive (US-ASCII). encodeURIComponent accomplishes this by first encoding the string as UTF-8 and then applying the %XX hex encoding familiar from escape to every resulting byte that would not be URL-safe.
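The difference is easiest to see side by side. The sketch below encodes one character inside the 0 to 255 range and one outside it with both functions; the sample characters and variable names are my own choice, and the outputs in the comments are what engines implementing the ECMAScript definitions of these functions should produce.

```javascript
// 'é' is U+00E9 (inside 0..255); '€' is U+20AC (outside it).
const latinEscape = escape('\u00E9');              // '%E9'       one ISO-8859-1 byte
const latinUtf8 = encodeURIComponent('\u00E9');    // '%C3%A9'    two UTF-8 bytes

const euroEscape = escape('\u20AC');               // '%u20AC'    not valid in a URI
const euroUtf8 = encodeURIComponent('\u20AC');     // '%E2%82%AC' three UTF-8 bytes

console.log(latinEscape, latinUtf8, euroEscape, euroUtf8);
```

Note that the two outputs are not interchangeable: a server expecting UTF-8 percent-encoding will misread `%E9` and will generally not understand `%u20AC` at all.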

    This is incidentally why you can build a UTF-8 encoder/decoder in JavaScript from two function calls, without any loops or garbage generation: unescape(encodeURIComponent(s)) encodes and decodeURIComponent(escape(s)) decodes. Combining the primitives this way cancels out everything except the UTF-8 processing, because unescape and decodeURIComponent do the same as their counterparts in reverse.
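A minimal sketch of that two-call trick follows. The helper names utf8Encode and utf8Decode are hypothetical, and the "byte string" convention (one character per byte, each in the range 0 to 255) is an assumption of this sketch rather than anything the functions enforce.

```javascript
// Hypothetical helper: turn a JavaScript string into a "byte string" whose
// characters (each 0..255) are the UTF-8 bytes of the input.
function utf8Encode(text) {
  // encodeURIComponent emits UTF-8 as %XX; unescape turns each %XX into one char.
  return unescape(encodeURIComponent(text));
}

// Hypothetical helper: the inverse of utf8Encode.
function utf8Decode(byteString) {
  // escape turns each byte char back into %XX; decodeURIComponent reads UTF-8.
  return decodeURIComponent(escape(byteString));
}

const original = '\u00E9\u20AC';                   // 'é€'
const utf8Bytes = utf8Encode(original);            // 5 chars: C3 A9 E2 82 AC
const roundTripped = utf8Decode(utf8Bytes);        // 'é€' again

console.log(utf8Bytes.length, roundTripped === original);
```

utf8Decode throws a URIError if its input is not valid UTF-8, which can double as a cheap validity check. In current environments TextEncoder and TextDecoder cover the same need without relying on escape and unescape.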

    (*) Footnote: For characters above 255, escape in browsers such as Google Chrome produces %uXXXX. That form is described for escape in the ECMAScript specification but is not part of the URI standard, and web server support for decoding it is not as well implemented as decoding the IETF-standardized UTF-8 based encoding.
