We know that code points fall in the range 0..10FFFF, which is less than 2^21. So why do we need UTF-32 when every code point can be represented in 3 bytes? Shouldn't a "UTF-24" be enough?
UTF-32's code unit is a multiple of 16 bits (24 is not). Working with 32-bit quantities is much more common than working with 24-bit quantities and is usually better supported by hardware and programming languages. It also keeps each character 4-byte aligned (assuming the string itself is 4-byte aligned). Going from 1 byte to 2 bytes to 4 bytes is the most "logical" progression.
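To make the alignment point concrete, here is a minimal C sketch contrasting random access into a UTF-32 string with access into a hypothetical 3-bytes-per-code-point encoding. The names `utf32_at` and `utf24_at` are made up for illustration; no standard "UTF-24" exists.

```c
#include <stdint.h>
#include <stdio.h>

/* UTF-32: each code point is one aligned 32-bit unit, so random access
   is a plain array index -- a single aligned load on most CPUs. */
static uint32_t utf32_at(const uint32_t *s, size_t i) {
    return s[i];
}

/* Hypothetical "UTF-24": each code point packed into 3 bytes
   (little-endian here). Random access now needs a multiply by 3 and
   three separate byte loads, and the 3-byte units regularly straddle
   natural alignment boundaries. */
static uint32_t utf24_at(const uint8_t *s, size_t i) {
    const uint8_t *p = s + 3 * i;
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) | ((uint32_t)p[2] << 16);
}

int main(void) {
    /* U+1F600 (an emoji) followed by U+0041 ('A') */
    uint32_t utf32[] = { 0x1F600, 0x0041 };
    uint8_t  utf24[] = { 0x00, 0xF6, 0x01,   /* U+1F600 in 3 bytes */
                         0x41, 0x00, 0x00 }; /* U+0041  in 3 bytes */

    printf("UTF-32 index 0: U+%04X\n", utf32_at(utf32, 0));
    printf("UTF-24 index 0: U+%04X\n", utf24_at(utf24, 0));
    return 0;
}
```

With 32-bit units the index operation compiles to a single aligned load; with 3-byte units every access involves address arithmetic and byte reassembly, which is the practical cost the alignment argument is pointing at.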
Apart from that: the Unicode standard is ever-growing. Code points outside of that range could eventually be assigned (this is somewhat unlikely in the near future, however, given the huge number of unassigned code points still available).