Size of wchar_t* for surrogate pair (Unicode character out of BMP) on Windows

Posted by 送分小仙女 on 2019-12-04 07:55:45
Naveen

sizeof(s2) returns the number of bytes required to store the pointer s2 itself, the same as for any other pointer, which is 4 bytes on your system. It has nothing to do with the character(s) stored in the memory that s2 points to.

sizeof(wchar_t*) is the same as sizeof(void*), in other words the size of a pointer itself. That will always be 4 on a 32-bit system and 8 on a 64-bit system. To get the length of the string, use wcslen() or lstrlenW() instead of sizeof():

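#include <wchar.h> // wcslen

const wchar_t* s1 = L"a";
const wchar_t* s2 = L"\U0002008A"; // The "Han" character, outside the BMP

size_t i1 = sizeof(wchar_t); // i1 == 2 on Windows: bytes per wchar_t
size_t i2 = wcslen(s1); // i2 == 1: one wchar_t unit
size_t i3 = wcslen(s2); // i3 == 2: a surrogate pair, two wchar_t units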
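
To make the pointer-versus-contents distinction concrete, here is a minimal sketch. It assumes char16_t as a stand-in for the 16-bit wchar_t found on Windows, so that the numbers are the same on every platform; the names kArr, kPtr, pointer_size, array_bytes and code_units are made up for illustration and do not come from the question.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Assumption: char16_t models Windows' 16-bit wchar_t (UTF-16 code units).
const char16_t kArr[] = u"\U0002008A";  // array: surrogate pair plus terminator
const char16_t* const kPtr = kArr;      // pointer to the same characters

// Size of the pointer object itself; it does not depend on the string.
constexpr std::size_t pointer_size = sizeof(kPtr);

// Size of the whole array in bytes, including the terminating null unit.
constexpr std::size_t array_bytes = sizeof(kArr);

// Number of UTF-16 code units before the terminator, which is what
// wcslen() reports for the equivalent wchar_t string on Windows.
inline std::size_t code_units(const char16_t* s) {
    assert(s != nullptr);
    return std::char_traits<char16_t>::length(s);
}
```

Under these assumptions, array_bytes covers three 16-bit units (two for the surrogate pair, one for the terminator), while pointer_size only reflects the pointer width of the build target.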

Addendum to the answers, to untangle the different units used by i1, i2 and i3 in the question's update:

i1 value of 2 is a size in bytes: sizeof(wchar_t) on Windows.
i2 value of 1 is a length in wchar_t units, IOW 2 bytes (since sizeof(wchar_t) is 2), not counting the terminating null.
i3 value of 2 is a length in wchar_t units, IOW 4 bytes, because the character is stored as a UTF-16 surrogate pair.
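
The conversion between these units can be sketched as follows. This is only an illustration: it again assumes char16_t as a portable stand-in for Windows' 16-bit wchar_t, and the helper names byte_length and code_points are hypothetical, not part of any library.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Bytes occupied by the characters of s, excluding the terminator:
// the code-unit count multiplied by the size of one unit.
inline std::size_t byte_length(const char16_t* s) {
    assert(s != nullptr);
    return std::char_traits<char16_t>::length(s) * sizeof(char16_t);
}

// Number of Unicode code points in s. A high surrogate (0xD800-0xDBFF)
// followed by a low surrogate (0xDC00-0xDFFF) counts as one character;
// an unpaired surrogate is counted as one unit on its own.
inline std::size_t code_points(const char16_t* s) {
    assert(s != nullptr);
    std::size_t count = 0;
    while (*s != 0) {
        const bool high = *s >= 0xD800 && *s <= 0xDBFF;
        // s[1] is safe to read: at worst it is the terminator.
        const bool low_next = s[1] >= 0xDC00 && s[1] <= 0xDFFF;
        s += (high && low_next) ? 2 : 1;
        ++count;
    }
    return count;
}
```

With these helpers, the string holding U+2008A measures 2 code units, 4 bytes, and 1 code point, which are three different answers to "how long is it".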
