Is there an easy way to write UTF-8 octets in Visual Studio?

北荒 2020-12-19 04:23

I need to store UTF-8 encoded strings in standard char types in C++ source code, like so:

const char* twochars = "\xe6\x97\xa5\xd1\x88";  // UTF-8 octets for U+65E5 and U+0448
         


        
2 Answers
  •  我在风中等你
    2020-12-19 04:34

    With the current versions of VC++, there is no way to write a string literal directly in UTF-8. A future version should support UTF-8 string literals (the C++11 u8 prefix).

    I tried pasting non-ASCII text directly into a string literal in a source file and saving the file as UTF-8. A hex editor confirmed that the file was saved as UTF-8, but that still doesn't do what you want: at compile time, the compiler either maps those bytes to characters in the current code page or issues a warning.

    So the most portable way to create a string literal right now is to explicitly write the octets as you've been doing.

    If you want to do a run-time conversion, there are a couple of options.

    1. The Windows API has WideCharToMultiByte, which takes UTF-16 text and converts it to a multibyte encoding such as UTF-8.
    2. If you're using a new enough version of the compiler and the C++ runtime, you can use std::codecvt to convert your wide-character string to UTF-8.
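
To make the conversion itself concrete, here is a minimal, portable sketch of the UTF-16 to UTF-8 transformation that both options perform. It is hand-written rather than a call to WideCharToMultiByte or std::codecvt, so it can be compiled anywhere; the function name utf16_to_utf8 is hypothetical and not part of any library.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical helper (not a library function): encodes UTF-16 text as UTF-8.
// Throws std::invalid_argument on unpaired surrogates.
std::string utf16_to_utf8(const std::u16string& in) {
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        std::uint32_t cp = in[i];
        if (cp >= 0xD800 && cp <= 0xDBFF) {
            // High surrogate: combine with the following low surrogate.
            if (i + 1 >= in.size() || in[i + 1] < 0xDC00 || in[i + 1] > 0xDFFF)
                throw std::invalid_argument("unpaired high surrogate");
            cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i + 1] - 0xDC00);
            ++i;
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            throw std::invalid_argument("unpaired low surrogate");
        }

        if (cp < 0x80) {                       // 1 byte: 0xxxxxxx
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {               // 2 bytes: 110xxxxx 10xxxxxx
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {             // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {                               // 4 bytes: 11110xxx then three 10xxxxxx
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}
```

Calling utf16_to_utf8(u"\u65e5\u0448") yields the same five octets as the literal in the question. In real Windows code you would more likely call WideCharToMultiByte with CP_UTF8 than maintain an encoder like this.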

    You could use either technique to write a small utility that performs the conversion and prints the result as the explicit octets you need for a string literal. You could then copy and paste that output into your source code.
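
The output half of such a utility might look like the sketch below. It assumes the text has already been converted to UTF-8 by whichever method you chose; the function name to_octet_literal is hypothetical.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: renders UTF-8 bytes as a C++ string literal
// in which every byte is written as a \x escape.
std::string to_octet_literal(const std::string& utf8) {
    std::string out = "\"";
    char buf[5];  // room for backslash, 'x', two hex digits, and the terminator
    for (unsigned char byte : utf8) {
        std::snprintf(buf, sizeof buf, "\\x%02x", static_cast<unsigned>(byte));
        out += buf;
    }
    out += '"';
    return out;
}
```

Escaping every byte, including plain ASCII, is a deliberate choice here: a \x escape consumes all hex digits that follow it, so emitting an unescaped character such as 'a' right after an escape would change the value of the literal.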
