How to best deal with Windows' 16-bit wchar_t ugliness?

Submitted by 点点圈 on 2019-11-29 12:23:35

I'd do something like #4, but don't generate any output until you're sure the input is valid.

  • mbrtowc should decode the entire character. If it is outside the BMP, output the high surrogate and store the low surrogate in the mbstate_t so that the next call can return it.
  • wcrtomb should store a high surrogate in the mbstate_t and output nothing yet. When the matching low surrogate arrives, it outputs all 4 UTF-8 bytes if the pair forms a valid character.
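
The mbrtowc half of that could be sketched roughly as follows. This is only an illustration under stated assumptions: `utf16_state` is a hypothetical stand-in for mbstate_t, `decode_utf8` and `utf8_to_utf16_unit` are made-up names, and the return convention borrows (size_t)-3 from C11's mbrtoc16 to mean "a stored unit was emitted, no input consumed". Truncated input is simply treated as an error here, and a NUL byte returns 1 rather than 0, which differs from the real mbrtowc.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for mbstate_t: a low surrogate waiting to be emitted. */
typedef struct {
    uint16_t pending_low; /* 0 means nothing is pending */
} utf16_state;

/* Decodes and validates one whole UTF-8 sequence from s[0..n).
   Returns the number of bytes it occupies, or 0 if invalid or truncated. */
static size_t decode_utf8(const unsigned char *s, size_t n, uint32_t *cp)
{
    size_t len;
    uint32_t c, min;

    if (n == 0) return 0;
    if (s[0] < 0x80)                { *cp = s[0]; return 1; }
    else if ((s[0] & 0xE0) == 0xC0) { len = 2; c = s[0] & 0x1F; min = 0x80; }
    else if ((s[0] & 0xF0) == 0xE0) { len = 3; c = s[0] & 0x0F; min = 0x800; }
    else if ((s[0] & 0xF8) == 0xF0) { len = 4; c = s[0] & 0x07; min = 0x10000; }
    else return 0;                  /* stray continuation or invalid lead byte */

    if (n < len) return 0;
    for (size_t i = 1; i < len; i++) {
        if ((s[i] & 0xC0) != 0x80) return 0;
        c = (c << 6) | (s[i] & 0x3F);
    }
    /* Reject overlong forms, encoded surrogates, and values past U+10FFFF. */
    if (c < min || c > 0x10FFFF || (c >= 0xD800 && c <= 0xDFFF)) return 0;
    *cp = c;
    return len;
}

/* Writes exactly one UTF-16 code unit to *out per successful call.
   Returns the number of bytes consumed, (size_t)-3 if the stored low
   surrogate was emitted without consuming input, or (size_t)-1 on invalid
   input (in which case *out and *st are left untouched). */
size_t utf8_to_utf16_unit(uint16_t *out, const unsigned char *s, size_t n,
                          utf16_state *st)
{
    uint32_t cp;
    size_t len;

    assert(out != NULL && st != NULL);
    if (st->pending_low != 0) {     /* second half of a previous character */
        *out = st->pending_low;
        st->pending_low = 0;
        return (size_t)-3;
    }
    len = decode_utf8(s, n, &cp);   /* validate everything before any output */
    if (len == 0) return (size_t)-1;
    if (cp < 0x10000) {
        *out = (uint16_t)cp;        /* BMP: a single unit is enough */
    } else {
        cp -= 0x10000;
        *out = (uint16_t)(0xD800 + (cp >> 10));               /* high surrogate */
        st->pending_low = (uint16_t)(0xDC00 + (cp & 0x3FF));  /* kept for later */
    }
    return len;
}
```

Feeding it the four bytes F0 9F 98 80 (U+1F600) yields 0xD83D on the first call and 0xDE00 on the next. The wcrtomb direction would mirror this: stash the high surrogate in the state, then emit the 4 bytes once a valid low surrogate follows.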

If you are on Windows, convert between UTF-16 and UTF-8 a whole string at a time using MultiByteToWideChar and WideCharToMultiByte.
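
For the UTF-8 to UTF-16 direction, a whole-string conversion might look something like this. It is a minimal sketch, Windows-only (hence the `_WIN32` guard), and `utf8_to_utf16_alloc` is a hypothetical helper name, not part of any API. The usual two-call pattern is assumed: the first call asks for the required length, the second fills the buffer.

```c
#include <assert.h>
#include <stdlib.h>

#ifdef _WIN32
#include <windows.h>

/* Hypothetical helper: converts a NUL-terminated UTF-8 string into a newly
   allocated UTF-16 string. Returns NULL on invalid input or allocation
   failure. The caller frees the result with free(). */
wchar_t *utf8_to_utf16_alloc(const char *utf8)
{
    int len;
    wchar_t *wide;

    assert(utf8 != NULL);
    /* With -1 as the input length the terminator is included in the count.
       MB_ERR_INVALID_CHARS makes the call fail on malformed UTF-8. */
    len = MultiByteToWideChar(CP_UTF8, MB_ERR_INVALID_CHARS, utf8, -1, NULL, 0);
    if (len <= 0) return NULL;

    wide = malloc((size_t)len * sizeof *wide);
    if (wide == NULL) return NULL;

    if (MultiByteToWideChar(CP_UTF8, MB_ERR_INVALID_CHARS, utf8, -1,
                            wide, len) == 0) {
        free(wide);
        return NULL;
    }
    return wide;
}
#endif /* _WIN32 */
```

The reverse direction would follow the same shape with WideCharToMultiByte and CP_UTF8.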

While GCC defaults to a 32-bit wchar_t on most non-Windows targets, there are compiler switches that change that (for example -fshort-wchar). More generally, the C and C++ standards don't specify the size of wchar_t; in fact, wchar_t can be the same size as char.
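
Since the width is not guaranteed, code that cares can check it at compile time instead of assuming. A small sketch, where `wchar_fits_code_point` is a made-up name and the static_assert threshold is only an example of a requirement a project might choose:

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdint.h>
#include <wchar.h>    /* WCHAR_MAX */

/* Example requirement: refuse to build where wchar_t cannot even hold a
   UTF-16 code unit, e.g. where it is the same size as char. */
static_assert(WCHAR_MAX >= 0xFFFF,
              "this code assumes wchar_t holds at least a UTF-16 code unit");

/* Returns 1 if a single wchar_t can hold any Unicode code point
   (up to U+10FFFF), 0 if surrogate pairs are needed. */
int wchar_fits_code_point(void)
{
#if WCHAR_MAX >= 0x10FFFF
    return 1;
#else
    return 0;
#endif
}
```

On a typical Linux build this returns 1; on Windows, or with -fshort-wchar, it returns 0.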

If you want to avoid using Windows APIs (in your Windows wrapper code!?), then use mbstowcs to convert an entire string at a time.
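
A rough sketch of that, with `mbs_to_wcs_alloc` as a hypothetical helper name. Note that mbstowcs interprets the bytes according to the current LC_CTYPE locale, so the program would normally call setlocale(LC_ALL, "") first. Whether that locale is UTF-8 depends on the platform, which is an assumption this sketch does not check.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: converts a NUL-terminated multibyte string to a newly
   allocated wide string using the current locale. Returns NULL on an invalid
   sequence or allocation failure. The caller frees the result with free(). */
wchar_t *mbs_to_wcs_alloc(const char *s)
{
    size_t cap, n;
    wchar_t *w;

    assert(s != NULL);
    /* Every wide character consumes at least one input byte, so the byte
       length plus one for the terminator is a safe upper bound. */
    cap = strlen(s) + 1;
    w = malloc(cap * sizeof *w);
    if (w == NULL) return NULL;

    n = mbstowcs(w, s, cap);
    if (n == (size_t)-1) {      /* invalid multibyte sequence */
        free(w);
        return NULL;
    }
    w[n] = L'\0';               /* n < cap, so this is always in bounds */
    return w;
}
```

Sizing the buffer from strlen avoids relying on mbstowcs(NULL, s, 0) to measure the length, which POSIX and MSVC support but ISO C does not spell out.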
