Is there any reason to prefer UTF-16 over UTF-8?

野性不改 2020-12-25 11:39

Examining the attributes of UTF-16 and UTF-8, I can't find any reason to prefer UTF-16.

However, checking out Java and C#, it looks like strings and chars there def

7 answers
  •  余生分开走
    2020-12-25 12:30

    It depends on the expected character sets. If you expect heavy use of Unicode code points in the range U+0800 to U+FFFF (most CJK text, for example), UTF-16 will be more compact than UTF-8: those code points take three bytes in UTF-8 but only two in UTF-16. For ASCII-heavy text the opposite holds, since ASCII takes one byte in UTF-8 and two in UTF-16.
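
To make the size comparison concrete, here is a minimal Python sketch. The helper name `encoded_sizes` and the sample strings are made up for illustration and are not from the original answer. It encodes each sample both ways and counts bytes, using the `utf-16-le` codec so that no byte order mark is included in the count.

```python
def encoded_sizes(text):
    """Return (utf8_bytes, utf16_bytes) for text, excluding any BOM."""
    utf8 = len(text.encode("utf-8"))
    # "utf-16-le" fixes the byte order, so Python adds no 2-byte BOM.
    utf16 = len(text.encode("utf-16-le"))
    return utf8, utf16


# Illustrative samples: ASCII, Latin-1 accent, CJK, and a non-BMP emoji.
samples = {
    "ascii": "hello",
    "latin": "h\u00e9llo",
    "cjk": "野性不改",
    "emoji": "\U0001F600",
}

for name, text in samples.items():
    utf8, utf16 = encoded_sizes(text)
    print(f"{name}: UTF-8 = {utf8} bytes, UTF-16 = {utf16} bytes")
```

Only the CJK sample comes out smaller in UTF-16. The ASCII and accented samples are smaller in UTF-8, and the emoji takes four bytes in both encodings.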

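    Also, for efficiency reasons, Java and C# do not take surrogate pairs into account when indexing strings: an index refers to a 16-bit code unit, not to a code point. That shortcut is only wrong for code points outside the Basic Multilingual Plane. It would break down completely with UTF-8, where every code point outside ASCII is a sequence of two, three, or four bytes.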
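
The effect of indexing by code unit can be sketched as follows. Python is used here only as a stand-in: its own `str` indexes by code point, so the hypothetical helper `utf16_code_units` builds the list of 16-bit units that a UTF-16 string would be indexed by. Both helper names are assumptions made for this example.

```python
import struct


def utf16_code_units(text):
    """Return the list of 16-bit code units UTF-16 uses for text."""
    raw = text.encode("utf-16-le")
    # Unpack the bytes as little-endian unsigned 16-bit integers.
    return list(struct.unpack(f"<{len(raw) // 2}H", raw))


def is_surrogate(unit):
    """True if a 16-bit code unit is one half of a surrogate pair."""
    return 0xD800 <= unit <= 0xDFFF


text = "a\U0001F600b"  # 'a', one code point outside the BMP, 'b'
units = utf16_code_units(text)

print(len(text))    # number of code points
print(len(units))   # number of UTF-16 code units
print(hex(units[1]), is_surrogate(units[1]))
```

The string has three code points but four code units, and index 1 lands on the high half of a surrogate pair rather than on a whole character. Index 2 is the low half, and 'b' only appears at index 3.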
