ICU C++ Converting Encodings

Question

As I understand it, different locales have different encodings. With ICU I'd like to convert from a UnicodeString to the current locale's encoding, and back. Specifically I'm using Boost's Filesystem library, which in turn uses either Windows' UTF-16, or Linux's UTF-8 encodings.

Is there a way to reliably do this using ICU, or another library?

Answer 1:

You can use ICU, but you may find iconv() sufficient. It is a lot simpler to set up and operate, it's part of POSIX, and it's easily available for Windows.

With either library, you have to convert your Unicode string to a wide string. In iconv() that target encoding is called WCHAR_T. Once you have a wide string, you can use it directly in Windows.
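As a rough illustration of the iconv() route, here is a minimal sketch that converts a UTF-8 string to a wide string. The helper name `utf8_to_wide` is made up for this example, and the source encoding is assumed to be UTF-8; substitute whatever encoding your string is actually in. It also assumes the glibc-style prototype where the input buffer is `char**` (some older iconv implementations declare it as `const char**`).

```cpp
#include <iconv.h>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper: convert UTF-8 bytes to a wide string using iconv's
// "WCHAR_T" target. Throws std::runtime_error if the conversion fails.
std::wstring utf8_to_wide(const std::string& utf8) {
    iconv_t cd = iconv_open("WCHAR_T", "UTF-8");  // arguments are (to, from)
    if (cd == (iconv_t)-1)
        throw std::runtime_error("iconv_open failed");

    // A UTF-8 sequence never yields more wchar_t units than it has bytes,
    // so one wchar_t per input byte is always enough room.
    std::wstring out(utf8.size(), L'\0');

    char* in_ptr = const_cast<char*>(utf8.data());
    std::size_t in_left = utf8.size();
    char* out_ptr = reinterpret_cast<char*>(&out[0]);
    std::size_t out_left = out.size() * sizeof(wchar_t);

    std::size_t rc = iconv(cd, &in_ptr, &in_left, &out_ptr, &out_left);
    iconv_close(cd);
    if (rc == (std::size_t)-1)
        throw std::runtime_error("iconv failed (invalid or incomplete input?)");

    // Trim the unused tail of the output buffer.
    out.resize(out.size() - out_left / sizeof(wchar_t));
    return out;
}
```

Calling `utf8_to_wide("abc")` should give `L"abc"`. On Windows the result can be passed straight to the wide-character APIs.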

In Linux, you have two options. You can use wcstombs() to transform the wide string into the narrow multibyte encoding of the system's current locale (don't forget to call setlocale(LC_CTYPE, ""); first). Alternatively, if you are sure that you want UTF-8 rather than the system's encoding, you can transform your original string to UTF-8 directly, again with either library.
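The wcstombs() step might look like the sketch below. The function name `wide_to_locale` is hypothetical, and the caller is assumed to have called `setlocale(LC_CTYPE, "")` once at program start so that the environment's locale is in effect.

```cpp
#include <clocale>
#include <cstddef>
#include <cstdlib>
#include <stdexcept>
#include <string>

// Hypothetical helper: convert a wide string to the multibyte encoding of
// the current LC_CTYPE locale. Conversion stops at the first L'\0', so
// embedded null characters are not supported.
std::string wide_to_locale(const std::wstring& wide) {
    // MB_CUR_MAX is the maximum number of bytes one character can take
    // in the current locale, which gives a safe upper bound for the buffer.
    std::string out(wide.size() * MB_CUR_MAX, '\0');

    std::size_t n = std::wcstombs(&out[0], wide.c_str(), out.size());
    if (n == static_cast<std::size_t>(-1))
        throw std::runtime_error("character not representable in this locale");

    out.resize(n);  // n is the number of bytes written, excluding any terminator
    return out;
}
```

A typical call sequence is `std::setlocale(LC_CTYPE, "");` followed by `wide_to_locale(L"some path")`. If the locale is left at the default "C" locale, non-ASCII characters will generally fail to convert, which is why the setlocale() call matters.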

Maybe you'll find this post of mine to provide some background.