Windows Unicode C++ Stream Output Failure

Posted by 我与影子孤独终老i on 2019-11-27 08:42:51

To write to a file, you have to set the stream's locale correctly. For example, if you want the characters written as UTF-8, add:

const std::locale utf8_locale
            = std::locale(std::locale(), new std::codecvt_utf8<wchar_t>());
test_file.imbue(utf8_locale);

This needs these two include files:

#include <codecvt>
#include <locale>
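To check that the imbued stream really produces UTF-8, you can write a character and read the file back as raw bytes. This is only a minimal sketch: write_utf8_file and read_raw_bytes are hypothetical helper names, not part of any library, and the locale is imbued before the file is opened so the facet is in place for the first character.

```cpp
#include <cassert>
#include <codecvt>
#include <fstream>
#include <iterator>
#include <locale>
#include <string>

// Hypothetical helper: writes `text` to `path`, encoded as UTF-8.
// Returns true if the stream is still in a good state after flushing.
bool write_utf8_file(const std::string& path, const std::wstring& text)
{
    assert(!path.empty());
    std::wofstream out;
    // Imbue before opening so the conversion facet applies from the start.
    out.imbue(std::locale(std::locale(), new std::codecvt_utf8<wchar_t>()));
    out.open(path, std::ios::binary);
    out << text;
    out.flush();  // the conversion to bytes happens when the buffer is flushed
    return out.good();
}

// Hypothetical helper: returns the file contents as raw bytes, unconverted.
std::string read_raw_bytes(const std::string& path)
{
    std::ifstream in(path, std::ios::binary);
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}
```

Writing L"\u2122" this way should leave the three bytes E2 84 A2 in the file, which is the UTF-8 encoding of U+2122. Note that std::codecvt_utf8 is deprecated since C++17, so newer compilers may warn about it.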

To write to the console, you have to put the console into the correct mode (this is Windows-specific) by adding

_setmode(_fileno(stdout), _O_U8TEXT);

(in case you want to use UTF-8).

For this you have to add these 2 include files:

#include <fcntl.h>
#include <io.h>

Furthermore, you have to make sure that you are using a font that supports Unicode (for example Lucida Console). You can change the font in the properties of your console window.

The complete program now looks like this:

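#include <fstream>
#include <iostream>
#include <codecvt>
#include <locale>
#include <fcntl.h>   // _O_U8TEXT
#include <io.h>      // _setmode
#include <stdio.h>   // _fileno

int main()
{
  // Locale whose codecvt facet converts wchar_t to UTF-8 on output.
  const std::locale utf8_locale = std::locale(std::locale(),
                                    new std::codecvt_utf8<wchar_t>());
  {
    std::wofstream test_file("c:\\temp\\test.txt");
    test_file.imbue(utf8_locale);   // must happen before the first write
    test_file << L"\u2122";         // U+2122, the trade mark sign
  }                                 // the file is flushed and closed here

  // Windows-specific: switch stdout to UTF-8 text mode for wide output.
  _setmode(_fileno(stdout), _O_U8TEXT);
  std::wcout << L"\u2122";
}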

Are you always using std::wcout or are you sometimes using std::cout? Mixing these won't work. Of course, the error description "choking" doesn't say at all what problem you are observing. I'd suspect that this is a different problem to the one using files, however.
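One way to avoid mixing the two is to send everything through std::wcout and widen any narrow strings first. The sketch below assumes the narrow strings are UTF-8 encoded; widen_utf8 and print_line are hypothetical helper names made up for this illustration.

```cpp
#include <cassert>
#include <codecvt>
#include <iostream>
#include <locale>
#include <string>

// Hypothetical helper: converts a UTF-8 encoded narrow string to a wide string.
// With a 16-bit wchar_t (as on Windows) codecvt_utf8<wchar_t> only covers
// characters in the Basic Multilingual Plane.
std::wstring widen_utf8(const std::string& utf8)
{
    std::wstring_convert<std::codecvt_utf8<wchar_t>> converter;
    std::wstring wide = converter.from_bytes(utf8);  // throws std::range_error on bad input
    assert(converter.converted() == utf8.size());    // every input byte was consumed
    return wide;
}

// Hypothetical helper: prints a narrow string through std::wcout only,
// so std::cout is never touched.
void print_line(const std::string& utf8)
{
    std::wcout << widen_utf8(utf8) << L'\n';
}
```

With this, code that used to call std::cout << s would call print_line(s) instead, and only the wide stream is ever used on stdout. Like std::codecvt_utf8, std::wstring_convert is deprecated since C++17.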

As there is no real description of the problem, it takes somewhat of a crystal ball followed by a shot in the dark to hit the problem... Since you want to get Unicode characters from your file, make sure that the file stream you are using uses a std::locale whose std::codecvt<...> facet actually converts to a suitable Unicode encoding.
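For the reading direction, that could look roughly like the sketch below. It assumes the file is UTF-8 encoded, which the question does not confirm, and read_utf8_line is a hypothetical helper name.

```cpp
#include <cassert>
#include <codecvt>
#include <fstream>
#include <locale>
#include <string>

// Hypothetical helper: reads the first line of a UTF-8 encoded file
// and returns it as a wide string.
std::wstring read_utf8_line(const std::string& path)
{
    assert(!path.empty());
    std::wifstream in;
    // The codecvt facet of this locale decodes UTF-8 bytes into wchar_t.
    in.imbue(std::locale(std::locale(), new std::codecvt_utf8<wchar_t>()));
    in.open(path, std::ios::binary);
    std::wstring line;
    std::getline(in, line);  // stays empty if the file could not be opened
    return line;
}
```

Without the imbue call the stream would use the codecvt facet of the global locale, which by default is the classic "C" locale and does not decode UTF-8.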

I just tested GCC (versions 4.4 thru 4.7) and MSVC 10, which all exhibit this problem.

Equally broken is wprintf, which does as little as the C++ stream API.

I also tested the raw Win32 API to check that nothing else was causing the failure, and this works:

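#include <windows.h>
int main()
{
    // Named "console" rather than "stdout", which <stdio.h> defines as a macro.
    HANDLE console = GetStdHandle(STD_OUTPUT_HANDLE);
    DWORD n;  // receives the number of characters actually written
    WriteConsoleW( console, L"\u03B2", 1, &n, NULL );
}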

This writes β to the console (if you set cmd's font to something like Lucida Console).

Conclusion: wchar_t output is horribly broken in both large C++ Standard library implementations.

Although the wide character streams take Unicode as input, that's not what they produce as output - the characters go through a conversion. If a character can't be represented in the encoding that it's converting to, the output fails.
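You can see this failure with a file stream as well. The sketch below writes through the classic "C" locale, whose conversion cannot represent a character such as U+2122, and then checks the stream state. write_fails_in_classic_locale is a hypothetical helper name; it also reports failure if the file cannot be opened at all.

```cpp
#include <cassert>
#include <fstream>
#include <locale>
#include <string>

// Hypothetical helper: writes `text` through the classic "C" locale and
// returns true if the stream ends up in a failed state.
bool write_fails_in_classic_locale(const std::string& path,
                                   const std::wstring& text)
{
    assert(!path.empty());
    std::wofstream out;
    out.imbue(std::locale::classic());  // no Unicode-capable codecvt facet
    out.open(path, std::ios::binary);
    out << text;
    out.flush();        // forces the buffered characters through the conversion
    return out.fail();  // true once failbit or badbit has been set
}
```

Plain ASCII text such as L"abc" goes through, while L"\u2122" leaves the stream in a failed state, and everything written to it afterwards is silently dropped until the state is cleared.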
