If I understand correctly, it is possible to use both string and wstring to store UTF-8 text.
With char, ASCII characters take a single byte, some chinese charac
You are correct for those:
...Which means that str[3] doesn't necessarily point to the 4th character...only use them as dummy feature-less byte arrays...
std::string in C++ is not Unicode-aware: it is just a sequence of char, and char in C/C++ is just a byte. This is different from Java's String, which is defined in terms of Unicode (UTF-16 code units). You can store the encoded bytes of Chinese characters in a std::string, and that is a valid way to hold UTF-8 text, but the string treats every byte as a separate char, so member functions such as size(), operator[], and substr() count and index bytes, not characters.
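To make the byte-versus-character distinction concrete, here is a minimal sketch of counting characters (code points) in a UTF-8 std::string. The helper name utf8_length is made up for illustration, and the sketch assumes the input is already valid UTF-8; it does no validation.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: count Unicode code points in a UTF-8 encoded string.
// Assumption: s holds valid UTF-8; malformed input is not detected.
std::size_t utf8_length(const std::string& s) {
    std::size_t count = 0;
    for (unsigned char c : s) {
        // Continuation bytes have the form 10xxxxxx.
        // Every other byte starts a new code point.
        if ((c & 0xC0) != 0x80) {
            ++count;
        }
    }
    return count;
}
```

For a string built as std::string s = "a\xE4\xB8\xADz"; (the letter a, U+4E2D, the letter z), s.size() is 5 while utf8_length(s) is 3, and s[3] is the last byte of the middle character rather than a character of its own.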
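std::wstring may be closer to what you need, but note that the width and encoding of wchar_t are platform-dependent (typically 16-bit UTF-16 on Windows and 32-bit UTF-32 on Linux).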
One thing should be clarified: UTF-8 is just one encoding of Unicode characters, that is, a way of converting code points to and from bytes.
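As a rough illustration of what "encoding" means here, the sketch below turns a single Unicode code point into its UTF-8 bytes. The function name encode_utf8 is hypothetical, and the code assumes the argument is a valid Unicode scalar value (at most 0x10FFFF and not a surrogate); it does not check this.

```cpp
#include <string>

// Hypothetical helper: encode one code point as UTF-8.
// Assumption: cp is a valid Unicode scalar value; no range checks are done.
std::string encode_utf8(char32_t cp) {
    std::string out;
    if (cp < 0x80) {
        // 1 byte: 0xxxxxxx (plain ASCII)
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        // 2 bytes: 110xxxxx 10xxxxxx
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        // 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}
```

With this sketch, the ASCII letter A (U+0041) stays a single byte, while the Chinese character U+4E2D becomes the three bytes E4 B8 AD. The same code point could just as well be stored as one 32-bit unit in UTF-32; the character is the same, only the byte representation differs.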