How to print UTF-8 strings to std::cout on Windows?

南旧 2020-12-08 07:11

I'm writing a cross-platform application in C++. All strings are UTF-8-encoded internally. Consider the following simplified code:

#include 
#         


        
7 answers
  •  抹茶落季
    2020-12-08 07:55

    The problem is not std::cout but the Windows console. With C stdio, fputs( "\xc3\xbc", stdout ); prints the ü once you have set the UTF-8 code page (using either SetConsoleOutputCP or chcp) and selected a Unicode-capable font in cmd's settings (Consolas should support over 2000 characters, and there are registry hacks to add more capable fonts to cmd).
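
As a minimal sketch of the C-stdio approach, the helper below switches the console to the UTF-8 code page and passes the whole string to fputs in one call. The name write_utf8 and the _WIN32 guard are assumptions added for illustration and are not part of the answer; the console font still has to be chosen in cmd's settings.

```cpp
#include <cstdio>

#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical helper (not from the original answer): writes a complete
// UTF-8 string to stdout with a single fputs call.
// Returns true if fputs reported success.
bool write_utf8(const char* text) {
#ifdef _WIN32
    // Ask the console to decode output bytes as UTF-8 (same effect as `chcp 65001`).
    SetConsoleOutputCP(CP_UTF8);
#endif
    return std::fputs(text, stdout) >= 0;
}
```

A call such as write_utf8("\xc3\xbc\n"); sends both bytes of the ü together.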

    If you output one byte after the other with putc('\xc3', stdout); putc('\xbc', stdout); you will get two tofu boxes, because the console interprets each byte separately as an illegal character. This is probably what the C++ streams do.
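
To illustrate why each byte is illegal on its own, here is a small sketch that classifies a byte by its role in UTF-8. 0xC3 is a lead byte that announces a two-byte sequence, and 0xBC is a continuation byte that cannot start a character, so neither is valid when interpreted alone. The enum and function names are hypothetical and chosen for this example.

```cpp
// Role a single byte can play in a UTF-8 stream.
enum class Utf8Byte { Ascii, Lead, Continuation, Invalid };

// Hypothetical helper: classifies one byte by its high bits.
Utf8Byte classify_utf8_byte(unsigned char b) {
    if (b < 0x80) return Utf8Byte::Ascii;               // 0xxxxxxx: complete by itself
    if (b < 0xC0) return Utf8Byte::Continuation;        // 10xxxxxx: must follow a lead byte
    if (b >= 0xC2 && b <= 0xF4) return Utf8Byte::Lead;  // starts a 2- to 4-byte sequence
    return Utf8Byte::Invalid;                           // 0xC0, 0xC1, 0xF5..0xFF never occur
}
```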

    See UTF-8 output on Windows console for a lengthy discussion.

    For my own project, I finally implemented a std::stringbuf that does the conversion to Windows-1252. If you really need full Unicode output, however, this will not help you.
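
The answer does not show that conversion, so the following is only a rough sketch of what a UTF-8 to Windows-1252 mapping could look like, not the author's implementation. It relies on the fact that Windows-1252 agrees with Unicode for ASCII and for U+00A0 to U+00FF, maps the euro sign as a single example from the 0x80 to 0x9F range, and replaces everything else with '?'. The function name utf8_to_cp1252 is hypothetical.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: partial UTF-8 -> Windows-1252 conversion.
// Unmappable or malformed input is replaced by '?'.
std::string utf8_to_cp1252(const std::string& in) {
    std::string out;
    std::size_t i = 0;
    while (i < in.size()) {
        unsigned char b0 = static_cast<unsigned char>(in[i]);
        if (b0 < 0x80) {  // ASCII: same byte in both encodings
            out += static_cast<char>(b0);
            ++i;
            continue;
        }
        // Sequence length announced by the lead byte (0 = not a lead byte).
        std::size_t len = (b0 >= 0xC2 && b0 <= 0xDF) ? 2
                        : (b0 >= 0xE0 && b0 <= 0xEF) ? 3
                        : (b0 >= 0xF0 && b0 <= 0xF4) ? 4 : 0;
        if (len == 0 || i + len > in.size()) {  // stray or truncated byte
            out += '?';
            ++i;
            continue;
        }
        // Decode the code point from the lead byte and its continuation bytes.
        unsigned long cp = b0 & (0xFF >> (len + 1));
        bool ok = true;
        for (std::size_t k = 1; k < len; ++k) {
            unsigned char b = static_cast<unsigned char>(in[i + k]);
            if ((b & 0xC0) != 0x80) { ok = false; break; }
            cp = (cp << 6) | (b & 0x3F);
        }
        if (!ok) { out += '?'; ++i; continue; }
        i += len;
        if (cp >= 0xA0 && cp <= 0xFF) out += static_cast<char>(cp);  // identical range
        else if (cp == 0x20AC) out += '\x80';                        // euro sign
        else out += '?';                                             // not handled in this sketch
    }
    return out;
}
```

With this sketch, the ü from above ("\xc3\xbc") becomes the single byte 0xFC, while Greek letters come out as '?', which is the limitation the answer mentions.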

    An alternative approach would be replacing cout's streambuf with one that uses fputs for the actual output:

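    #include <cstdio>
    #include <iostream>
    #include <sstream>
    
    #include <Windows.h>
    
    // Collects output in a string and hands it to fputs as a whole on every flush.
    class MBuf: public std::stringbuf {
    public:
        int sync() override {
            fputs( str().c_str(), stdout );
            str( "" );
            return 0;
        }
    };
    
    int main() {
        SetConsoleOutputCP( CP_UTF8 );          // make the console decode output as UTF-8
        setvbuf( stdout, nullptr, _IONBF, 0 );  // disable stdio buffering
        MBuf buf;
        std::streambuf* old = std::cout.rdbuf( &buf );
        // Before C++20, u8"..." is a plain char string; in C++20 it is char8_t
        // and cannot be streamed to std::cout.
        std::cout << u8"Greek: αβγδ\n" << std::flush;
        std::cout.rdbuf( old );                 // restore before buf is destroyed
    }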
    

    I turned off output buffering here to prevent it from interfering with unfinished UTF-8 byte sequences.
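
If a buffer does have to be split, one way to avoid cutting a character in half is to write only up to the last complete sequence and keep the rest for the next write. The sketch below shows that idea; it is not part of the answer's code, the name complete_utf8_prefix is hypothetical, and it assumes the input is otherwise valid UTF-8.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: returns how many leading bytes of `s` can be written
// without splitting the final UTF-8 sequence.
std::size_t complete_utf8_prefix(const std::string& s) {
    std::size_t n = s.size();
    // Walk back over at most three trailing continuation bytes (10xxxxxx).
    std::size_t back = 0;
    while (back < n && back < 3 &&
           (static_cast<unsigned char>(s[n - 1 - back]) & 0xC0) == 0x80) {
        ++back;
    }
    if (back == n) return n;  // no lead byte in sight; nothing to hold back
    unsigned char lead = static_cast<unsigned char>(s[n - 1 - back]);
    // Total length the last sequence should have, judged by its first byte.
    std::size_t need = lead >= 0xF0 ? 4 : lead >= 0xE0 ? 3 : lead >= 0xC0 ? 2 : 1;
    // If fewer bytes are present than announced, stop before that sequence.
    return (back + 1 < need) ? n - 1 - back : n;
}
```

For example, a buffer ending in a lone 0xC3 would be written without that byte, and the byte would be sent together with its continuation on the next write.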
