wchar_t is 2 bytes in Visual Studio and stores UTF-16. How do Unicode-aware applications work with characters above U+FFFF?
Question: At our company, we are planning to make our application Unicode-aware, and we are analyzing what problems we are going to encounter. In particular, our application relies heavily on string lengths, and we would like to use wchar_t as the base character type. The problem arises with characters that must be stored as two 16-bit units in UTF-16, namely characters at or above U+10000 (that is, above U+FFFF). Simple example: I have the UTF-8 string "蟂" (Unicode character U+87C2, in UTF-8: E8 9F 82)
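To make the length concern concrete, here is a minimal sketch of the difference between counting UTF-16 code units and counting code points. It is an illustration only, not code from our application. It uses char16_t and std::u16string instead of wchar_t so that it behaves the same on platforms where wchar_t is wider than 16 bits. The helper names (is_high_surrogate, is_low_surrogate, count_code_points) are hypothetical, and treating an unpaired surrogate as one code point is an assumption made for simplicity.

```cpp
#include <cstddef>
#include <string>

// High (leading) surrogates occupy 0xD800..0xDBFF.
constexpr bool is_high_surrogate(char16_t u) { return u >= 0xD800 && u <= 0xDBFF; }

// Low (trailing) surrogates occupy 0xDC00..0xDFFF.
constexpr bool is_low_surrogate(char16_t u) { return u >= 0xDC00 && u <= 0xDFFF; }

// Counts code points in a UTF-16 string. A well-formed surrogate pair
// counts as one code point; an unpaired surrogate is counted as one
// code point as well (an assumption of this sketch, not a rule).
std::size_t count_code_points(const std::u16string& s) {
    std::size_t count = 0;
    for (std::size_t i = 0; i < s.size(); ++i) {
        if (is_high_surrogate(s[i]) && i + 1 < s.size() && is_low_surrogate(s[i + 1])) {
            ++i;  // skip the low surrogate that completes the pair
        }
        ++count;
    }
    return count;
}
```

With this sketch, a string holding only U+87C2 has size() == 1 and one code point, whereas a string holding only U+1F600 has size() == 2 but still one code point, which is exactly the mismatch a length-dependent application has to account for.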