How many characters can be mapped with Unicode?


感情败类 2020-11-27 02:48

I am asking for the count of all possible valid combinations in Unicode, with an explanation. I know a character can be encoded as 1, 2, 3, or 4 bytes. I also don't understand why

6 answers

北荒 (original poster)

2020-11-27 03:10

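Unicode defines 17 planes of 65,536 code points each, for a total of 17 × 65,536 = 1,114,112 code points (U+0000 to U+10FFFF). Of these, 2,048 are surrogates (U+D800 to U+DFFF), which are reserved for UTF-16 and can never represent a character on their own, so at most 1,112,064 code points can be assigned to characters. At the time of writing, only about 10% of this space had been allocated.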
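The arithmetic can be checked with a few lines of Python. This is only a sketch of the counting argument; the constant names are made up for illustration, and the surrogate range U+D800 to U+DFFF is taken from the Unicode standard rather than from the question.

```python
import sys

PLANES = 17                      # planes 0 through 16
CODE_POINTS_PER_PLANE = 0x10000  # 65,536 code points in each plane

total_code_points = PLANES * CODE_POINTS_PER_PLANE

# Surrogates (U+D800..U+DFFF) exist only to form UTF-16 pairs and are
# never characters by themselves.
surrogates = 0xDFFF - 0xD800 + 1
scalar_values = total_code_points - surrogates

print(total_code_points)  # 1114112
print(surrogates)         # 2048
print(scalar_values)      # 1112064

# The highest code point is one less than the total, i.e. U+10FFFF,
# which is also what Python reports as its largest code point.
print(hex(total_code_points - 1))  # 0x10ffff
print(hex(sys.maxunicode))         # 0x10ffff
```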

The precise details of how these code points are encoded differ with the encoding, but your question makes it sound like you are thinking of UTF-8. The restrictions on the continuation bytes are presumably there so that it is easy to find the beginning of the next character: continuation bytes always have the form 10xxxxxx, and a leading byte never has this form.
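To illustrate the point about continuation bytes, here is a minimal Python sketch that finds where each character begins in a UTF-8 byte string simply by skipping every byte of the form 10xxxxxx. The helper name char_starts and the sample string are hypothetical, chosen so that the characters take 1, 2, 3, and 4 bytes respectively.

```python
def char_starts(data: bytes) -> list:
    """Return the byte offsets at which a UTF-8 encoded character begins."""
    # A continuation byte has its top two bits set to 10, so masking with
    # 0b11000000 yields 0b10000000. Any other byte starts a character.
    return [i for i, b in enumerate(data) if (b & 0b1100_0000) != 0b1000_0000]


text = "aé€😀"
encoded = text.encode("utf-8")

# Bytes used per character: 'a' = 1, 'é' = 2, '€' = 3, '😀' = 4
print([len(ch.encode("utf-8")) for ch in text])  # [1, 2, 3, 4]

# Offsets of the leading bytes within the 10-byte encoding
print(char_starts(encoded))  # [0, 1, 3, 6]
```

Because no leading byte can look like a continuation byte, a decoder that lands in the middle of a character only has to step forward (or back) past at most three bytes to resynchronize.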
