Is there any reason to prefer UTF-16 over UTF-8?

后端 未结 7 1632
野性不改
野性不改 2020-12-25 11:39

Examining the attributes of UTF-16 and UTF-8, I can\'t find any reason to prefer UTF-16.

However, checking out Java and C#, it looks like strings and chars there def

7条回答
  •  粉色の甜心
    2020-12-25 12:09

    East Asian languages typically require less storage in UTF-16 (2 bytes is enough for 99% of East-Asian language characters) than UTF-8 (typically 3 bytes is required).

    Of course, for Western lanagues, UTF-8 is usually smaller (1 byte instead of 2). For mixed files like HTML (where there's a lot of markup) it's much of a muchness.

    Processing of UTF-16 for user-mode applications is slightly easier than processing UTF-8, because surrogate pairs behave in almost the same way that combining characters behave. So UTF-16 can usually be processed as a fixed-size encoding.

提交回复
热议问题