C++ WriteFile Unicode characters

徘徊边缘 submitted on 2020-08-26 07:50:09

Question


I'm trying to write a wstring into a UTF-8 file using the WriteFile function. I want the file to contain the characters "ÑÁ", but I'm getting "�" instead.

Here is the code:

#include <iostream>
#include <cstdlib>
#include <sstream>
#include <string>
#include <fstream>
#include <windows.h>
#include <wchar.h>
#include <stdio.h>
#include <winbase.h>
using namespace std;

const char filepath [] = "unicode.txt";

int main ()
{ 
   wstring str;
   str.append(L"ÑÁ");
   wchar_t* wfilepath;

   // Create a file to work with Unicode and UTF-8
   ofstream fs;
   fs.open(filepath, ios::out|ios::binary);
   unsigned char smarker[3];
   smarker[0] = 0xEF;
   smarker[1] = 0xBB;
   smarker[2] = 0xBF;
   fs << smarker;
   fs.close();

   //Open and write in the file with windows functions
   mbstowcs(wfilepath, filepath, strlen(filepath));
   HANDLE hfile;
   hfile = CreateFileW(TEXT(wfilepath), GENERIC_WRITE, 0, NULL,
       OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
   wstringbuf strBuf (str, ios_base::out|ios::app);
   DWORD bytesWritten;
   DWORD dwBytesToWrite = (DWORD) strBuf.in_avail();
   WriteFile(hfile, &strBuf, dwBytesToWrite, &bytesWritten, NULL);
   CloseHandle(hfile);
}

I compile it on Cygwin using this command line:
g++ -std=c++11 -g Windows.C -o Windows


Answer 1:


You need to convert the UTF-16 data to UTF-8 before writing it to the file.

And there is no need to create the file with std::ofstream, close it, and re-open it with CreateFileW(). Just open the file once and write everything you need.

Try this:

#include <iostream>
#include <cstdlib>
#include <string>
//#include <codecvt>
//#include <locale>

#include <windows.h>
#include <wchar.h>
#include <stdio.h>

using namespace std;

LPCWSTR filepath = L"unicode.txt";

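// Convert a UTF-16 wstring to a UTF-8 encoded string.
string to_utf8(const wstring &s)
{
    /*
    // Standard C++11 alternative (needs <codecvt> and <locale>; deprecated since C++17):
    wstring_convert<codecvt_utf8_utf16<wchar_t>> utf16conv;
    return utf16conv.to_bytes(s);
    */

    string utf8;
    // First call: ask how many bytes the UTF-8 output needs.
    int len = WideCharToMultiByte(CP_UTF8, 0, s.c_str(), static_cast<int>(s.length()), NULL, 0, NULL, NULL);
    if (len > 0)
    {
        utf8.resize(len);
        // Second call: perform the conversion into the resized buffer.
        WideCharToMultiByte(CP_UTF8, 0, s.c_str(), static_cast<int>(s.length()), &utf8[0], len, NULL, NULL);
    }
    return utf8;
}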

int main ()
{ 
    wstring str = L"ÑÁ";

    // Create a UTF-8 file and write in it using Windows functions
    HANDLE hfile = CreateFileW(filepath, GENERIC_WRITE, 0, NULL,
       CREATE_ALWAYS, FILE_ATTRIBUTE_NORMAL, NULL);
    if (hfile != INVALID_HANDLE_VALUE)
    {
        // UTF-8 byte order mark (BOM)
        const unsigned char smarker[3] = { 0xEF, 0xBB, 0xBF };
        DWORD bytesWritten;

        WriteFile(hfile, smarker, sizeof(smarker), &bytesWritten, NULL);

        // Convert to UTF-8, then write the encoded bytes
        string strBuf = to_utf8(str);
        WriteFile(hfile, strBuf.c_str(), static_cast<DWORD>(strBuf.size()), &bytesWritten, NULL);

        CloseHandle(hfile);
    }

    return 0;
}
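
For readers who want to see what the conversion itself does, or who need it outside Windows, here is a minimal portable sketch. It is not the method used in the answer above. The names append_utf8 and utf16_to_utf8 are hypothetical, and taking std::u16string instead of wstring is an assumption made so that the code unit width is the same on every platform. Replacing unpaired surrogates with U+FFFD is one possible policy, not the only one.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Append one Unicode code point to `out` as UTF-8 (1 to 4 bytes).
static void append_utf8(std::string &out, std::uint32_t cp)
{
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Hypothetical helper: convert UTF-16 code units to UTF-8 bytes.
// Unpaired surrogates are replaced with U+FFFD (an assumed policy).
std::string utf16_to_utf8(const std::u16string &s)
{
    std::string out;
    for (std::size_t i = 0; i < s.size(); ++i) {
        std::uint32_t cp = s[i];
        if (cp >= 0xD800 && cp <= 0xDBFF) {
            // High surrogate: combine with the following low surrogate, if any.
            if (i + 1 < s.size() && s[i + 1] >= 0xDC00 && s[i + 1] <= 0xDFFF) {
                cp = 0x10000 + ((cp - 0xD800) << 10) + (s[i + 1] - 0xDC00);
                ++i;
            } else {
                cp = 0xFFFD;
            }
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            // Low surrogate with no preceding high surrogate.
            cp = 0xFFFD;
        }
        append_utf8(out, cp);
    }
    return out;
}
```

With this sketch, utf16_to_utf8(u"\u00D1\u00C1") yields the four bytes C3 91 C3 81, which is the UTF-8 encoding of "ÑÁ".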



Answer 2:


The problem is here:

wstringbuf strBuf (str, ios_base::out|ios::app);
WriteFile(hfile, &strBuf, dwBytesToWrite, &bytesWritten, NULL);

&strBuf is the address of the wstringbuf object itself, which holds things like a pointer to the content, the buffer position, and status flags. It is not the address of the characters.

You probably wanted

WriteFile(hfile, &str[0], (DWORD)(str.size() * sizeof(wchar_t)), &bytesWritten, NULL);

but that will just store the same encoding that your wstring uses (UTF-16 on Windows). To write UTF-8, you may want to use WideCharToMultiByte (or wcstombs, since you already used mbstowcs, although its output encoding depends on the current C locale).
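
To illustrate why writing the wstring buffer verbatim does not produce a UTF-8 file, this small sketch spells out the bytes involved. It assumes 16-bit little-endian code units, as on Windows. The name utf16le_bytes is hypothetical, and it takes std::u16string so that the result does not depend on the platform's wchar_t.

```cpp
#include <string>

// Hypothetical helper: serialize UTF-16 code units as little-endian bytes,
// i.e. the layout that dumping the string's buffer as-is would give on Windows.
std::string utf16le_bytes(const std::u16string &s)
{
    std::string out;
    for (char16_t unit : s) {
        out += static_cast<char>(unit & 0xFF);  // low byte first
        out += static_cast<char>(unit >> 8);    // then high byte
    }
    return out;
}
```

For "ÑÁ" this gives the bytes D1 00 C1 00, whereas the UTF-8 encoding is C3 91 C3 81. A UTF-8 reader treats 0xD1 as the lead byte of a two-byte sequence, and 0x00 is not a valid continuation byte, so such a file would likely be displayed with replacement characters.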




Answer 3:


Ben is correct that you're writing raw wchar_t to the file, not UTF-8.

To write UTF-8, you might consider staying inside of C++ and doing this:

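#include <codecvt>   // std::codecvt_utf8 (deprecated since C++17)
#include <fstream>
#include <locale>

int main()
{
    std::locale loc(std::locale(), new std::codecvt_utf8<wchar_t>);
    std::wofstream fs("unicode.txt");
    fs.imbue(loc);
    fs << L"ÑÁ";
}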
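
If the text is already available as UTF-8 bytes, another way to stay inside standard C++ is to write those bytes through a binary std::ofstream, which avoids the <codecvt> facets that were deprecated in C++17. The sketch below assumes the caller has already done the encoding. The name write_utf8_file is hypothetical, and emitting a BOM is a choice carried over from the question rather than a requirement of UTF-8.

```cpp
#include <fstream>
#include <string>

// Hypothetical helper: write a UTF-8 BOM followed by already-encoded UTF-8 bytes.
// Returns true only if the file was opened and every write succeeded.
bool write_utf8_file(const std::string &path, const std::string &utf8)
{
    std::ofstream fs(path, std::ios::out | std::ios::binary | std::ios::trunc);
    if (!fs)
        return false;

    // write() takes an explicit length, so the BOM needs no null terminator.
    static const char bom[] = { '\xEF', '\xBB', '\xBF' };
    fs.write(bom, sizeof bom);
    fs.write(utf8.data(), static_cast<std::streamsize>(utf8.size()));

    fs.close();  // flush before reporting the result
    return !fs.fail();
}
```

For example, write_utf8_file("unicode.txt", "\xC3\x91\xC3\x81") would produce a file containing the BOM followed by the UTF-8 bytes for "ÑÁ". Using write() with a length also sidesteps the problem with streaming an unterminated unsigned char array through operator<<, which treats it as a C string.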


Source: https://stackoverflow.com/questions/28618715/c-writefile-unicode-characters
