C++ writing UTF-8 on Linux

Posted by 半腔热情 on 2019-12-11 11:36:35

Question


I have the following code on Windows written in C++ with Visual Studio:

  FILE* outFile = fopen(outFileName, "a,ccs=UTF-8");
  fwrite(buffer.c_str(), buffer.getLength() * sizeof(wchar_t), 1, outFile);
  std::wstring newLine = L"\n";
  fwrite(newLine.c_str(), sizeof(wchar_t), 1, outFile);
  fclose(outFile);

This correctly writes out the file in UTF-8. When I compile and run the same code on Linux, the file is created, but it is zero length. If I change the fopen command as follows, the file is created and non-zero length, but all non-ASCII characters display as garbage:

  FILE* outFile = fopen(outFileName, "a");

Does ccs=UTF-8 not work on Linux gcc?


Answer 1:


No. The ccs=UTF-8 mode flag is a Microsoft extension to fopen, and it does not work on Linux, macOS, Android, iOS, or anywhere else. Microsoft adds such extensions, and relying on them leaves you with code that is incompatible with other platforms.

Convert your wide string to a byte string that contains UTF-8, then write the bytes to the file as usual. There are many ways to do this, but perhaps the most standard-compliant one is the following:

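  #include <codecvt>
  #include <iostream>
  #include <locale>
  #include <string>

  // Note: std::wstring_convert and <codecvt> are deprecated since C++17.
  // codecvt_utf8_utf16 treats each wchar_t as a UTF-16 code unit (the Windows
  // layout). On Linux wchar_t is 32 bits wide, where std::codecvt_utf8<wchar_t>
  // is the matching facet.
  using Converter = std::wstring_convert<std::codecvt_utf8_utf16<wchar_t>, wchar_t>;

  int main()
  {
      std::wstring wide = L"Öö Tiib 😛";
      // to_bytes() returns the UTF-8 encoding; it throws std::range_error on failure
      std::string u8 = Converter{}.to_bytes(wide);
      // note: I just put the bytes out to cout, you want to write to file
      std::cout << std::endl << u8 << std::endl;
  }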

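The original answer links to a live demo built with g++ 8.1.0. Older compilers should also work as long as the standard library ships <codecvt>, which libstdc++ has done since GCC 5.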
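Since the example above only prints to cout, here is a minimal sketch of the "write the bytes to file" step. The helper name appendUtf8Line is hypothetical, and choosing the facet by sizeof(wchar_t) is my own assumption rather than part of the original answer: it picks the UTF-16 facet where wchar_t is 16 bits (Windows) and the UCS-4 facet where it is 32 bits (Linux).

```cpp
#include <codecvt>
#include <fstream>
#include <locale>
#include <string>
#include <type_traits>

// Assumption: pick the facet that matches the platform's wchar_t width.
using WideFacet = std::conditional<sizeof(wchar_t) == 2,
                                   std::codecvt_utf8_utf16<wchar_t>,
                                   std::codecvt_utf8<wchar_t>>::type;

// Hypothetical helper: converts `wide` to UTF-8 and appends it, followed by
// a newline, to the file at `path`. Returns false if the file cannot be
// opened or written. to_bytes() throws std::range_error on invalid input.
bool appendUtf8Line(const std::string& path, const std::wstring& wide)
{
    std::wstring_convert<WideFacet, wchar_t> converter;
    const std::string bytes = converter.to_bytes(wide);

    // Binary mode, so the bytes reach the file untouched on every platform.
    std::ofstream out(path, std::ios::binary | std::ios::app);
    if (!out)
        return false;

    out << bytes << '\n';
    return static_cast<bool>(out);
}
```

A call such as appendUtf8Line("out.txt", L"Öö Tiib") would then stand in for the fopen/fwrite/fclose sequence from the question.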

Note that it is rare for anyone to need wide strings on Linux; most code there uses UTF-8 in plain byte strings only.
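If the text is kept as UTF-8 in a std::string from the start, no conversion is needed and the fopen/fwrite style from the question carries over almost unchanged. The following is only a sketch; the helper name appendLine is hypothetical.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: appends an already UTF-8 encoded string plus a
// newline to `path` using C stdio. Returns false on any I/O failure.
bool appendLine(const char* path, const std::string& utf8)
{
    std::FILE* file = std::fopen(path, "ab");  // plain append mode, no ccs= flag
    if (!file)
        return false;

    const std::string line = utf8 + "\n";
    // Element size 1, count = number of bytes: UTF-8 is a byte sequence.
    const bool written =
        std::fwrite(line.data(), 1, line.size(), file) == line.size();

    // Always close the file, and report failure if either step failed.
    return std::fclose(file) == 0 && written;
}
```

For example, appendLine("out.txt", "\xC3\x96\xC3\xB6 Tiib") writes "Öö Tiib" followed by a newline; the escaped bytes are the UTF-8 encoding of Ö and ö.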



Source: https://stackoverflow.com/questions/50890630/c-writing-utf-8-on-linux
