Shift-JIS decoding fails using wifstream in Visual C++ 2013

Posted by 可紊 on 2019-12-30 05:05:07

Question


I am trying to read a text file encoded in Shift-JIS (code page 932) using std::wifstream and std::getline. The following code works in VS2010 but fails in VS2013:

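// Requires <fstream>, <locale>, and <string>.
std::wifstream in;
in.open("data932.txt");

// Locale for code page 932 (Shift-JIS).
const std::locale locale(".932");

in.imbue(locale);

std::wstring line1, line2;
std::getline(in, line1);  // first line: ASCII only
std::getline(in, line2);  // second line: Japanese script
const bool good = in.good();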

The file contains several lines, where the first line contains just ASCII characters, and the second is Japanese script. Thus, when this snippet runs, line1 should contain the ASCII line, line2 the Japanese script, and good should be true.

When compiled in VS2010, the result is as expected. But when compiled in VS2013, line1 contains the ASCII line, but line2 is empty, and good is false.

I debugged into the CRT (the source ships with Visual Studio) and found that an internal function, _Mbrtowc (in xmbtowc.c), was modified between the two versions. The way it detects the lead byte of a double-byte character changed, and the VS2013 version fails to detect the lead byte, so it fails to decode the byte stream.
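To illustrate why lead-byte detection matters here, the following is a minimal, portable sketch and not the CRT's implementation. The function names is_cp932_lead_byte and count_cp932_chars are hypothetical; the ranges 0x81-0x9F and 0xE0-0xFC are the lead-byte ranges of code page 932.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: true if `b` starts a double-byte character in CP932.
// Lead-byte ranges for code page 932: 0x81-0x9F and 0xE0-0xFC.
bool is_cp932_lead_byte(unsigned char b) {
    return (b >= 0x81 && b <= 0x9F) || (b >= 0xE0 && b <= 0xFC);
}

// Hypothetical helper: counts characters (not bytes) in a CP932 byte string.
// A lead byte followed by another byte is counted as one character.
std::size_t count_cp932_chars(const std::string& bytes) {
    std::size_t count = 0;
    for (std::size_t i = 0; i < bytes.size(); ++i) {
        const unsigned char b = static_cast<unsigned char>(bytes[i]);
        if (is_cp932_lead_byte(b) && i + 1 < bytes.size()) {
            ++i;  // skip the trail byte
        }
        ++count;
    }
    return count;
}
```

With a single-byte table such as the one for code page 1252, no byte is a lead byte, so a decoder would treat each half of a Japanese character as a separate byte instead of consuming the pair together.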

Further debugging led to the point where the _Isleadbyte array of a _Cvtvec object is initialized (in the function _Getcvt(), in xwctomb.c); that initialization produces a wrong result. It seems to always use code page 1252, the default code page on my system, rather than 932, which is set for the stream in use. However, I could not determine whether this is by design and I missed a required step, or whether it is indeed a bug in the VS2013 CRT.

Unfortunately I don't have VS2012 installed, so I could not test on that version.

Any insights on this topic are welcome!


Answer 1:


I have found a workaround: if I explicitly change the global multibyte (MBC) code page while the locale is being created, the locale is initialized correctly, and the lines are read and decoded as expected.

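// _getmbcp and _setmbcp are declared in <mbctype.h>.
const int oldMbcp = _getmbcp();   // remember the current multibyte code page
_setmbcp(932);                    // switch to 932 while the locale is constructed
const std::locale locale("Japanese_Japan.932");
_setmbcp(oldMbcp);                // restore the previous code page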
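One caveat with the snippet above: the std::locale constructor throws std::runtime_error if the name is not accepted, in which case the code page would not be restored. A possible refinement is a small RAII guard. This is only a sketch: ScopedIntSetting and make_cp932_locale are hypothetical names, not part of the CRT or of the original answer, and the MSVC-specific part is guarded so the rest compiles elsewhere.

```cpp
#ifdef _MSC_VER
#include <locale>
#include <mbctype.h>  // _getmbcp, _setmbcp
#endif

// Hypothetical RAII helper: applies an int-valued global setting on
// construction and restores the previous value on destruction.
class ScopedIntSetting {
public:
    typedef int (*Getter)();
    typedef int (*Setter)(int);

    ScopedIntSetting(Getter get, Setter set, int value)
        : set_(set), old_(get()) {
        set_(value);
    }
    ~ScopedIntSetting() { set_(old_); }

private:
    // Non-copyable: the guard must restore the setting exactly once.
    ScopedIntSetting(const ScopedIntSetting&);
    ScopedIntSetting& operator=(const ScopedIntSetting&);

    Setter set_;
    int old_;
};

#ifdef _MSC_VER
// Same steps as the workaround, but the code page is restored even if
// the std::locale constructor throws.
std::locale make_cp932_locale() {
    ScopedIntSetting guard(&_getmbcp, &_setmbcp, 932);
    return std::locale("Japanese_Japan.932");
}
#endif
```

The stream would then be imbued as before, for example in.imbue(make_cp932_locale()). Note that _setmbcp changes process-wide state, so other threads using multibyte functions at the same moment could be affected; that caveat applies equally to the original workaround.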


Source: https://stackoverflow.com/questions/26618463/shift-jis-decoding-fails-using-wifstrem-in-visual-c-2013
