If I want to use C++11's regular expressions with Unicode strings, will they work with char* as UTF-8 or do I have to convert them to a wchar_t* string?
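Yes, they will work with char* as UTF-8; this is by design of the UTF-8 encoding. Substring operations work correctly if the string is treated as an array of bytes rather than an array of code points, because the byte sequence of one encoded character never occurs inside the byte sequence of another. Keep in mind that the regex engine then sees bytes, so `.` and bracket expressions match a single byte, not a whole non-ASCII character.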
See FAQ #18 at http://www.utf8everywhere.org/#faq.validation for how the encoding's design achieves this.
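To illustrate the byte-wise behavior, here is a minimal sketch, assuming the text and the pattern are both valid UTF-8 held in std::string. The helper name `utf8_regex_search` is made up for this example and is not part of the standard library; non-ASCII characters are written as escaped bytes so the result does not depend on the source file's encoding.

```cpp
#include <regex>
#include <string>

// Hypothetical helper (not a standard function): searches UTF-8 text with a
// UTF-8 pattern, with std::regex treating both as plain sequences of bytes.
bool utf8_regex_search(const std::string& text, const std::string& pattern) {
    const std::regex re(pattern);  // default ECMAScript grammar
    return std::regex_search(text, re);
}

// "naïve café" spelled out as UTF-8 bytes:
// ï is 0xC3 0xAF and é is 0xC3 0xA9, two bytes each.
const std::string sample_text = "na\xC3\xAFve caf\xC3\xA9";

// The literal substring "café", also as UTF-8 bytes.
const std::string cafe_pattern = "caf\xC3\xA9";
```

With these definitions, `utf8_regex_search(sample_text, cafe_pattern)` should find the literal substring. The pattern `"caf.$"` should not match, because `.` consumes only the first of é's two bytes, while `"caf..$"` should. If you need `.` or character classes to operate on whole code points, converting to a wide string or using a Unicode-aware regex library is the safer route.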