Unicode problems in C++ but not C

无人共我 2021-01-11 15:13

I'm trying to write Unicode strings to the screen in C++ on Windows. I changed my console font to Lucida Console and I set the output to CP_UTF8 a

3 answers
  •  慢半拍i (OP)
    2021-01-11 15:53

    It's more surprising that the C implementation works here than that the C++ one doesn't. A char holds a single byte (numerical values 0-255), so by itself it cannot represent a character that needs more than one byte, and the console should show only ASCII characters.

    C must be doing some magic for you here: it apparently guesses that the bytes you're providing from outside the ASCII range (0-127) form a Unicode (probably UTF-8) multi-byte character. C++ just displays each byte of your const char[] array, and since the bytes of a UTF-8 sequence, treated separately, don't have distinct glyphs in your font, it prints � for each one. Note that you assign 6 letters and get 12 question marks: each Cyrillic letter takes two bytes in UTF-8.
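
The 6-letters-to-12-bytes arithmetic can be checked with a small sketch. The word "Привет" is only an assumed stand-in for the string in the question (which is not shown here), and utf8_code_points is a hypothetical helper, not a standard library function. The bytes are written as hex escapes so the result does not depend on the source file's encoding.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: counts code points in a UTF-8 string by skipping
// continuation bytes, which always have the bit pattern 10xxxxxx.
std::size_t utf8_code_points(const std::string& s) {
    std::size_t count = 0;
    for (unsigned char byte : s) {
        if ((byte & 0xC0) != 0x80) {
            ++count;  // lead byte or plain ASCII byte starts a new code point
        }
    }
    return count;
}

// Assumed sample text: "Привет" (6 Cyrillic letters) as explicit UTF-8 bytes.
std::string sample_utf8() {
    return "\xD0\x9F\xD1\x80\xD0\xB8\xD0\xB2\xD0\xB5\xD1\x82";
}
```

Here sample_utf8().size() is 12 while utf8_code_points(sample_utf8()) is 6, which matches the twelve placeholder glyphs seen when every byte is rendered on its own.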

    You can read about the UTF-8 and ASCII encodings if you want, but the point is that std::wstring and std::wcout are really the best solution here, since they are designed to handle larger-than-byte characters.
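
A minimal sketch of the wide-character route might look like the following. It assumes the Microsoft C runtime on Windows, where _setmode with _O_U16TEXT switches stdout to UTF-16 output; the answer itself does not say which setup call to use, so that choice is an assumption. The helper names prepare_wide_stdout, sample_wide and print_sample are hypothetical, and the sample word is again an assumed stand-in for the question's text.

```cpp
#include <iostream>
#include <string>
#ifdef _WIN32
#include <fcntl.h>   // _O_U16TEXT
#include <io.h>      // _setmode
#include <stdio.h>   // _fileno, stdout
#else
#include <clocale>   // std::setlocale
#endif

// Hypothetical helper: put stdout into a state where std::wcout can emit
// non-ASCII characters.
void prepare_wide_stdout() {
#ifdef _WIN32
    // Assumption: Microsoft C runtime; stdout becomes a UTF-16 text stream.
    _setmode(_fileno(stdout), _O_U16TEXT);
#else
    // Elsewhere, adopt the user's locale so wide output can be converted.
    std::setlocale(LC_ALL, "");
#endif
}

// Assumed sample text: "Привет" written with universal character names,
// so the source file can stay pure ASCII.
std::wstring sample_wide() {
    return L"\u041F\u0440\u0438\u0432\u0435\u0442";
}

// Prints the sample through std::wcout after preparing the stream.
void print_sample() {
    prepare_wide_stdout();
    std::wcout << sample_wide() << L'\n';
}
```

Once stdout is in a Unicode mode on Windows, narrow output through std::cout or printf on the same stream is not supported by the Microsoft runtime, so a program following this sketch should stick to the wide functions.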

    (If you're not using Latin characters at all, you don't even save memory by using char-based solutions such as const char[] and std::string instead of std::wstring. Each Cyrillic letter takes two bytes in UTF-8 anyway, the same as one 2-byte wchar_t on Windows.)
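
The memory remark can be illustrated with a rough comparison of the character data held by each string type. payload_bytes and the two sample functions are hypothetical names, and the sample word is an assumption rather than the string from the question.

```cpp
#include <cstddef>
#include <string>

// Bytes of character data in a narrow (here UTF-8) string.
std::size_t payload_bytes(const std::string& s) {
    return s.size() * sizeof(char);
}

// Bytes of character data in a wide string. sizeof(wchar_t) is 2 on
// Windows and typically 4 on Linux and macOS.
std::size_t payload_bytes(const std::wstring& s) {
    return s.size() * sizeof(wchar_t);
}

// Assumed sample: "Привет" as explicit UTF-8 bytes.
std::string narrow_sample() {
    return "\xD0\x9F\xD1\x80\xD0\xB8\xD0\xB2\xD0\xB5\xD1\x82";
}

// The same assumed sample as a wide string.
std::wstring wide_sample() {
    return L"\u041F\u0440\u0438\u0432\u0435\u0442";
}
```

On Windows both calls report 12 bytes for this word, so the char-based version saves nothing. On platforms where wchar_t is 4 bytes the wide string takes 24 bytes, so the comparison only holds for the Windows case the question is about.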
