How many characters can UTF-8 encode?

一个人的身影 2020-11-28 01:55

If UTF-8 is 8 bits, doesn't that mean there can be a maximum of only 256 different characters?

The first 128 code points are the same as in ASCII. But it says UTF-

10 answers

孤街浪徒 (original poster)

2020-11-28 02:24

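Unicode and UTF-8 are closely tied together. The "8" in UTF-8 is the size of one code unit, not of a whole character: a single code point can take several 8-bit units, so the limit is not 256. The Unicode code space runs from U+0000 to U+10FFFF, which is 1,114,112 code points and fits in 21 bits (2^21 = 2,097,152 is the capacity of 21 bits, not the number of code points). UTF-8 is restricted to exactly the same range, and both exclude the same surrogate range (U+D800..U+DFFF), which leaves 1,112,064 code points that can actually be encoded. Only a fraction of those are assigned: as of June 2018 the most recent version, Unicode 11.0, contains a repertoire of 137,439 characters.

From the Unicode Standard (Unicode FAQ):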

The Unicode Standard encodes characters in the range U+0000..U+10FFFF, which amounts to a 21-bit code space.
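To make the arithmetic behind that range concrete, here is a small Python sketch. The variable names are my own and only illustrative; the numbers follow directly from the U+0000..U+10FFFF range and the surrogate block U+D800..U+DFFF.

```python
# Size of the Unicode code space, U+0000..U+10FFFF inclusive.
MAX_CODE_POINT = 0x10FFFF
code_points = MAX_CODE_POINT + 1

# Surrogates U+D800..U+DFFF are reserved for UTF-16 and are not valid in UTF-8.
surrogates = 0xDFFF - 0xD800 + 1
encodable = code_points - surrogates

# Number of bits needed to hold the largest code point.
bits_needed = MAX_CODE_POINT.bit_length()

print(code_points)   # 1114112
print(surrogates)    # 2048
print(encodable)     # 1112064
print(bits_needed)   # 21
print(2 ** 21)       # 2097152, the capacity of 21 bits, larger than the code space
```

So "21-bit code space" means the values fit in 21 bits, not that all 2^21 values are in use.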

From the UTF-8 Wikipedia page (Description section):

Since the restriction of the Unicode code-space to 21-bit values in 2003, UTF-8 is defined to encode code points in one to four bytes, ...
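As a rough illustration of "one to four bytes", the sketch below encodes one sample character from each length class with Python's built-in `str.encode`, and compares the result with a hand-written helper. `utf8_length` is a hypothetical name of mine, not a standard library function; its thresholds are the usual UTF-8 range boundaries.

```python
def utf8_length(code_point: int) -> int:
    """Return how many bytes UTF-8 uses for a code point (illustrative helper)."""
    if 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("surrogates cannot be encoded in UTF-8")
    if code_point < 0x80:
        return 1          # U+0000..U+007F, same as ASCII
    if code_point < 0x800:
        return 2          # U+0080..U+07FF
    if code_point < 0x10000:
        return 3          # U+0800..U+FFFF
    if code_point <= 0x10FFFF:
        return 4          # U+10000..U+10FFFF
    raise ValueError("beyond the Unicode code space")

# One sample per length class: A, e-acute, euro sign, grinning-face emoji.
samples = ["A", "\u00e9", "\u20ac", "\U0001F600"]

for ch in samples:
    encoded = ch.encode("utf-8")
    print(f"U+{ord(ch):04X} -> {len(encoded)} byte(s): {encoded.hex()}")

# U+0041 -> 1 byte(s): 41
# U+00E9 -> 2 byte(s): c3a9
# U+20AC -> 3 byte(s): e282ac
# U+1F600 -> 4 byte(s): f09f9880
```

Each byte is still 8 bits, but a character may span up to four of them, which is why UTF-8 is not limited to 256 characters.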
