Do C++11 regular expressions work with UTF-8 strings?

一个人的身影 2020-12-01 07:57

If I want to use C++11's regular expressions with Unicode strings, will they work with char* as UTF-8, or do I have to convert them to a wchar_t* string?

4 answers
  •  眼角桃花
    2020-12-01 08:18

    Yes, they will; this follows from the design of the UTF-8 encoding. Substring operations work correctly as long as the string is treated as an array of bytes rather than an array of code points.

    See FAQ #18 at http://www.utf8everywhere.org/#faq.validation for how the encoding's design achieves this.
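To illustrate the byte-array view, here is a minimal sketch, assuming the text and the pattern are both valid UTF-8 stored in std::string. The names ByteSpan and find_bytes are hypothetical helpers made up for this example and are not part of the standard library. It searches for a literal UTF-8 substring with std::regex and reports where the match lies, measured in bytes.

```cpp
#include <cstddef>
#include <regex>
#include <string>

// Hypothetical result type: where the match lies, measured in bytes.
struct ByteSpan {
    bool found;
    std::size_t offset;  // byte offset of the match within the text
    std::size_t length;  // byte length of the match
};

// Hypothetical helper: first match of `pattern` in `text`.
// Assumption: both strings hold valid UTF-8. std::regex (char-based)
// compares bytes, so a multi-byte character in the pattern is simply
// matched as a sequence of literal bytes.
ByteSpan find_bytes(const std::string& text, const std::string& pattern) {
    const std::regex re(pattern);
    std::smatch m;
    if (!std::regex_search(text, m, re)) {
        return ByteSpan{false, 0, 0};
    }
    return ByteSpan{true,
                    static_cast<std::size_t>(m.position(0)),
                    static_cast<std::size_t>(m.length(0))};
}
```

For example, calling find_bytes("na\xC3\xAFve caf\xC3\xA9", "caf\xC3\xA9") searches the UTF-8 bytes of "naïve café" for "café". The reported offset and length count bytes, not characters, so a two-byte character such as "é" contributes 2 to the length. The same byte-level view means that constructs like `.` consume a single byte, so this sketch sticks to literal substrings.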
