Do UTF-8, UTF-16, and UTF-32 differ in the number of characters they can store?

慢半拍i 2020-12-01 02:55

Okay. I know this looks like the typical "Why didn't he just Google it or go to www.unicode.org and look it up?" question, but for such a simple question the ans

6 answers
  •  一生所求
    2020-12-01 03:32

    UTF-8, UTF-16, and UTF-32 all support the full set of Unicode code points. No character is supported by one encoding but not by another.

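    As for the bonus question, "Do these encodings differ in the number of characters they can be extended to support?": yes and no. In principle the encodings have different capacities. UTF-16 is the limiting factor, because its surrogate-pair mechanism cannot reach beyond U+10FFFF, whereas the original UTF-8 design could encode up to 2^31 code points and a 32-bit UTF-32 code unit has room for 2^32 values. In practice, however, the Unicode Consortium will not assign code points that cannot be represented in all three encodings, so the code space is fixed at U+0000 to U+10FFFF and UTF-8 and UTF-32 are restricted to that range as well. Doing otherwise would violate the spirit of the encoding standards and make it impossible to guarantee a one-to-one mapping between UTF-32 and UTF-8 (or UTF-16).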
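
To make the answer above concrete, here is a minimal Python sketch, not part of the original answer. It round-trips a few sample code points through all three encodings and derives the UTF-16 ceiling from the surrogate-pair arithmetic. The names `SAMPLES`, `ENCODINGS`, `encoded_lengths`, and `UTF16_LIMIT` are illustrative choices, and the little-endian codec variants are used only so that no byte order mark is counted in the lengths.

```python
# Sample code points: ASCII, Latin-1, a BMP symbol, a supplementary-plane
# emoji, and the highest code point in the Unicode code space.
SAMPLES = [0x41, 0xE9, 0x20AC, 0x1F600, 0x10FFFF]

# Little-endian variants are chosen so that no BOM is added to the output.
ENCODINGS = ["utf-8", "utf-16-le", "utf-32-le"]


def encoded_lengths(code_point):
    """Return {encoding: byte length}, checking that each round trip is lossless."""
    ch = chr(code_point)
    lengths = {}
    for enc in ENCODINGS:
        data = ch.encode(enc)
        # Every encoding must decode back to the same character.
        assert data.decode(enc) == ch
        lengths[enc] = len(data)
    return lengths


# UTF-16 reaches beyond the BMP with surrogate pairs: 1024 high surrogates
# times 1024 low surrogates, on top of the 0x10000 BMP positions.
UTF16_LIMIT = 0x10000 + 0x400 * 0x400  # 0x110000, so the last code point is U+10FFFF

if __name__ == "__main__":
    for cp in SAMPLES:
        print(f"U+{cp:04X}", encoded_lengths(cp))
    print(f"UTF-16 ceiling: {UTF16_LIMIT:#x}")
```

All three encodings cover every sample, differing only in how many bytes they need. The ceiling computed from the surrogate arithmetic is the same bound Python's `chr` enforces, which is why `chr(0x110000)` raises `ValueError`.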
