Difference between MBCS and UTF-8 on Windows


醉话见心 2020-11-28 03:23

I am reading about character sets and encodings on Windows. I noticed that there are two compiler flags in the Visual Studio C++ compiler called MBCS and UNICODE. What i

4 answers

不知归路 (original poster)

2020-11-28 04:04
MBCS means Multi-Byte Character Set and describes any character set where a character is encoded into (possibly) more than 1 byte.

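ASCII and the single-byte ANSI code pages (such as Windows-1252) are not multi-byte. Note that some Windows "ANSI" code pages, such as code page 932 (Shift-JIS), do use one or two bytes per character, and those are multi-byte character sets.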
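To make the single-byte versus multi-byte distinction concrete, here is a small sketch in Python (chosen only because its codecs are easy to inspect; it is not what the Visual Studio flags do internally). The helper `bytes_per_char` is a hypothetical name introduced for illustration, and `cp1252` and `cp932` are Python's codec names for the corresponding Windows code pages.

```python
# Hypothetical helper for illustration only: the number of bytes each
# character of `text` occupies when encoded with the given Python codec.
def bytes_per_char(text, codec):
    return [len(ch.encode(codec)) for ch in text]

# Windows-1252 is a single-byte code page: "A" and U+00E9 (e with acute)
# each take exactly one byte.
single_byte = bytes_per_char("A\u00e9", "cp1252")

# Code page 932 (Shift-JIS) is multi-byte: "A" stays one byte, while
# U+3042 (hiragana "a") takes two bytes.
multi_byte = bytes_per_char("A\u3042", "cp932")

print(single_byte)  # [1, 1]
print(multi_byte)   # [1, 2]
```

The same string can therefore have a different byte length depending on the code page, which is why byte counts and character counts have to be kept apart in MBCS code.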

UTF-8, however, is a multi-byte encoding. It encodes any Unicode character as a sequence of 1, 2, 3, or 4 octets (bytes).
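As a quick check of the 1-to-4-byte claim, the following Python sketch encodes one code point from each length class. The function name `utf8_length` and the sample code points are assumptions picked for illustration.

```python
def utf8_length(code_point):
    """Number of bytes UTF-8 uses for one Unicode code point."""
    return len(chr(code_point).encode("utf-8"))

# One sample per length class: Latin "A", e with acute, euro sign, an emoji.
for cp in (0x41, 0xE9, 0x20AC, 0x1F600):
    print(f"U+{cp:04X}: {utf8_length(cp)} byte(s)")
# U+0041: 1 byte(s)
# U+00E9: 2 byte(s)
# U+20AC: 3 byte(s)
# U+1F600: 4 byte(s)
```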

However, UTF-8 is only one of several concrete encodings of the Unicode character set. Notably, UTF-16 is another, and it is the encoding Windows uses for its wide-character (UNICODE) APIs and .NET uses for strings. Here's the difference between UTF-8 and UTF-16:
- UTF-8 encodes any Unicode character as a sequence of 1, 2, 3, or 4 bytes.
- UTF-16 encodes Unicode characters up to U+FFFF (the Basic Multilingual Plane) as 2 bytes, and all others as 4 bytes (a surrogate pair).
It is therefore not correct that Unicode is a 16-bit character encoding. Its code points range from U+0000 up to U+10FFFF, so representing every code point takes 21 bits.
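The UTF-16 side of this comparison, and the 21-bit figure, can be checked with a short Python sketch. The helpers `utf16_length` and `surrogate_pair` are hypothetical names introduced here; the surrogate arithmetic follows the standard UTF-16 definition.

```python
def utf16_length(code_point):
    """Number of bytes UTF-16 uses for one code point.

    The "utf-16-le" codec is used so no byte order mark is added.
    """
    return len(chr(code_point).encode("utf-16-le"))

def surrogate_pair(code_point):
    """Split a code point above U+FFFF into its UTF-16 surrogate pair."""
    offset = code_point - 0x10000
    return 0xD800 + (offset >> 10), 0xDC00 + (offset & 0x3FF)

print(utf16_length(0x20AC))                        # 2
print(utf16_length(0x1F600))                       # 4
print([hex(u) for u in surrogate_pair(0x1F600)])   # ['0xd83d', '0xde00']

# The largest code point, U+10FFFF, needs 21 bits.
print((0x10FFFF).bit_length())                     # 21
```

This is why code that assumes one 16-bit unit per character breaks on characters outside the Basic Multilingual Plane.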