Question
I'm trying to print the Chinese character 中 using the types wchar_t, char16_t and char32_t, without success (live example)
#include <iostream>

int main()
{
    char x[] = "中";            // Chinese character with Unicode code point U+4E2D
    char y[] = u8"中";
    wchar_t z = L'中';
    char16_t b = u'\u4e2d';
    char32_t a = U'\U00004e2d';

    std::cout << x << '\n';     // Ok
    std::cout << y << '\n';     // Ok
    std::wcout << z << '\n';    // ??
    std::cout << a << '\n';     // prints the decimal number (20013) corresponding to the code point U+4E2D
    std::cout << b << '\n';     // likewise prints the decimal number (20013)
}
Answer 1:
Since you're running your test on a Linux system, the source code is UTF-8, which is why x and y are the same thing. Those bytes are shunted, unmodified, into the standard output by std::cout << x and std::cout << y, and when you view the web page (or when you look at the Linux terminal), you see the character as you expected.
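To check that x and y really hold the same bytes, one option is to dump them in hex. The sketch below is only an illustration: hex_bytes is a hypothetical helper, not part of the original question or answer, and the character is written with explicit byte escapes so the result does not depend on the source file's encoding.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper (an assumption for illustration): renders each byte
// of a narrow string as two lowercase hex digits, separated by spaces.
std::string hex_bytes(const std::string& s) {
    std::string out;
    char buf[4];
    for (unsigned char c : s) {
        if (!out.empty()) out += ' ';
        std::snprintf(buf, sizeof buf, "%02x", static_cast<unsigned>(c));
        out += buf;
    }
    return out;
}
```

For the UTF-8 encoding of U+4E2D, hex_bytes("\xE4\xB8\xAD") returns "e4 b8 ad". Calling hex_bytes(x) and hex_bytes(y) in a UTF-8 source file should give that same string for both.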
std::wcout << z will print if you do two things:
std::ios::sync_with_stdio(false);
std::wcout.imbue(std::locale("en_US.utf8"));
Without unsynchronizing from C stdio, GNU libstdc++ goes through the C I/O streams, which can never print a wide char after printing a narrow char on the same stream. LLVM libc++ appears to work even when synchronized, but it still needs the imbue to tell the stream how to convert the wide chars into the bytes it sends to the standard output.
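The locale name "en_US.utf8" is not guaranteed to exist on every system, and std::locale throws std::runtime_error when it does not. A minimal sketch of a guarded version follows; print_wide is a hypothetical helper, not something from the original answer, and it assumes the caller has already called std::ios::sync_with_stdio(false) before any output.

```cpp
#include <iostream>
#include <locale>
#include <stdexcept>

// Hypothetical helper (an assumption for illustration).
// Returns false if the named locale is not installed or the write fails.
bool print_wide(wchar_t ch, const char* locale_name = "en_US.utf8") {
    try {
        // The locale's codecvt facet tells the stream how to turn
        // wide characters into the bytes written to standard output.
        std::wcout.imbue(std::locale(locale_name));
    } catch (const std::runtime_error&) {
        return false;  // locale not available on this system
    }
    std::wcout << ch << L'\n';
    return static_cast<bool>(std::wcout);
}
```

A caller would run std::ios::sync_with_stdio(false) once at the top of main and then call print_wide(z). Checking the return value makes a missing locale visible instead of ending the program with an uncaught exception.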
To print b and a, you will have to convert them to wide or narrow; even with wbuffer_convert, setting up a char32_t stream is a lot of work. Converting a to narrow (UTF-8) would look like this:
std::wstring_convert<std::codecvt_utf8<char32_t>, char32_t> conv32;
std::cout << conv32.to_bytes(a) << '\n';
Putting it all together: http://coliru.stacked-crooked.com/a/a809c38e21cc1743
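Note that std::wstring_convert and the <codecvt> facets have been deprecated since C++17. As an alternative, the conversion to narrow can be done by hand. The sketch below is a swapped-in technique, not the answer's method: to_utf8 is a hypothetical name, and it substitutes U+FFFD for invalid input, which is a design assumption rather than a requirement.

```cpp
#include <string>

// Hypothetical helper (an assumption for illustration): encodes a single
// Unicode code point as UTF-8. Invalid input becomes U+FFFD.
std::string to_utf8(char32_t cp) {
    const bool surrogate = (cp >= 0xD800 && cp <= 0xDFFF);
    if (surrogate || cp > 0x10FFFF) {
        return "\xEF\xBF\xBD";  // UTF-8 encoding of U+FFFD
    }
    std::string out;
    if (cp < 0x80) {                      // 1 byte: 0xxxxxxx
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {              // 2 bytes: 110xxxxx 10xxxxxx
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {            // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {                              // 4 bytes: 11110xxx then three 10xxxxxx
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}

// A lone char16_t is a complete code point only outside the surrogate
// range; the char32_t overload rejects surrogates for us.
std::string to_utf8(char16_t unit) {
    return to_utf8(static_cast<char32_t>(unit));
}
```

With these overloads, std::cout << to_utf8(a) << '\n' and std::cout << to_utf8(b) << '\n' both send the bytes E4 B8 AD to the standard output, the same bytes that x and y contain.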
Source: https://stackoverflow.com/questions/31571430/im-trying-to-print-a-chinese-character-using-the-types-wchar-t-char16-t-and-ch