Why is UTF-8 compatible with ASCII?

一向 2021-02-08 04:16

A in UTF-8 is U+0041 LATIN CAPITAL LETTER A. A in ASCII is 065.

How is UTF-8 backwards-compatible with ASCII?

3 answers
  •  萌比男神i
    2021-02-08 04:33

    ASCII uses only the low 7 bits of an 8-bit byte, so it covers every bit pattern from 00000000 to 01111111. Each of the 128 values in this range is mapped to a specific character.

    UTF-8 keeps these exact mappings. The character represented by 01101011 in ASCII is represented by the same single byte in UTF-8. All other characters are encoded as sequences of multiple bytes in which every byte has the highest bit set; i.e. every byte of every non-ASCII character in UTF-8 has the form 1xxxxxxx.
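
To see this concretely, here is a minimal Python sketch. The helper name `utf8_bits` is made up for illustration; it only prints the bytes that Python's built-in `str.encode` produces. It also shows that U+0041 and 065 from the question are the same number, written in hexadecimal and in decimal.

```python
def utf8_bits(text):
    """Return each UTF-8 byte of `text` as an 8-character bit string."""
    return [format(byte, "08b") for byte in text.encode("utf-8")]


# U+0041 (hex) and 65 (decimal) are the same code for "A".
assert ord("A") == 0x41 == 65

# An ASCII character is the same single byte in ASCII and in UTF-8.
assert "A".encode("ascii") == "A".encode("utf-8") == bytes([65])

# The byte from the answer above: 01101011 is "k" in both encodings.
print(utf8_bits("k"))        # ['01101011']

# A non-ASCII character, U+00E9 (e with acute accent), takes two bytes,
# and both have the highest bit set.
print(utf8_bits("\u00e9"))   # ['11000011', '10101001']
```

Because no byte of a multi-byte sequence starts with a 0 bit, a program that only understands ASCII can never mistake part of a non-ASCII character for an ASCII one.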
