What is the difference between UTF-8 and Unicode?

独厮守ぢ 2020-11-22 17:08

I have heard conflicting opinions from people - according to the Wikipedia UTF-8 page.

They are the same thing, aren't they? Can someone clarify?

15 answers
  •  臣服心动
    2020-11-22 17:14

    The existing answers already explain a lot of details, but here's a very short answer with the most direct explanation and example.

    Unicode is the standard that maps characters to codepoints.
    Each character has a unique codepoint (identification number), which is a number like 9731.

    UTF-8 is an encoding of the codepoints.
    In order to store all characters on disk (in a file), UTF-8 encodes each codepoint as 1 to 4 octets (8-bit sequences), i.e. bytes. UTF-8 is one of several encodings (methods of representing data). For example, in Unicode, the (decimal) codepoint 9731 represents a snowman (☃), which consists of 3 bytes in UTF-8: E2 98 83
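
The snowman example can be checked with a few lines of Python. This is only a minimal sketch using built-in functions; the variable names are illustrative and not part of the original answer, and `bytes.hex` with a separator assumes Python 3.8 or newer.

```python
# Unicode side: the character and its codepoint (a plain number).
snowman = chr(9731)           # codepoint 9731 decimal, written U+2603 in hex
codepoint = ord(snowman)      # back from character to number

# UTF-8 side: the same codepoint turned into bytes for storage.
encoded = snowman.encode("utf-8")
utf8_hex = encoded.hex(" ").upper()

# A different encoding of the same codepoint, for comparison.
utf16_hex = snowman.encode("utf-16-be").hex(" ").upper()

print(snowman, codepoint, hex(codepoint))
print("UTF-8 bytes: ", utf8_hex)
print("UTF-16 bytes:", utf16_hex)
```

The codepoint stays the same in both cases; only the bytes used to store it change with the encoding.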

    Here's a sorted list with some random examples.
