PHP and C++ for UTF-8 code unit in reverse order in Chinese character

北城余情 submitted on 2019-11-28 14:38:31

They're both correct. The difference is in endianness.

My guess is that UTF-16 will output the string as little-endian by default. You can enforce big-endianness by using UTF-16BE instead.

That, or the exact reverse ;)
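To make the byte-order point concrete, here is a minimal C++ sketch. It is only an illustration, not what PHP or any library does internally, and the helper name `utf16_bytes` is made up for this example. It writes the same 16-bit code units out once high byte first (big-endian) and once low byte first (little-endian).

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper (not from the original answer): serialize UTF-16
// code units into bytes using the requested byte order.
std::vector<std::uint8_t> utf16_bytes(const std::u16string& units, bool big_endian) {
    std::vector<std::uint8_t> out;
    out.reserve(units.size() * 2);
    for (char16_t u : units) {
        const std::uint8_t hi = static_cast<std::uint8_t>(u >> 8);   // high byte
        const std::uint8_t lo = static_cast<std::uint8_t>(u & 0xFF); // low byte
        if (big_endian) {
            out.push_back(hi);  // UTF-16BE: high byte first
            out.push_back(lo);
        } else {
            out.push_back(lo);  // UTF-16LE: low byte first
            out.push_back(hi);
        }
    }
    return out;
}
```

For the single code unit 0x4E2D (the character 中), the big-endian bytes are 4E 2D and the little-endian bytes are 2D 4E. Same character, same code unit, just the two bytes swapped.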

Note that these are not Unicode code points, but rather the UTF-16BE/LE/UCS-2 byte representation. Code points are a different set of numbers.
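A rough C++ sketch of the distinction between a code point and its encoded forms follows. The function names `to_utf16` and `to_utf8` are made up for this example, and the code assumes the input is a valid Unicode scalar value. For characters up to U+FFFF the single UTF-16 code unit happens to have the same numeric value as the code point, but above that the code point is split into a surrogate pair, and the UTF-8 bytes are different numbers in either case.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: encode one code point as UTF-16 code units
// (a single unit, or a surrogate pair above U+FFFF).
std::u16string to_utf16(char32_t cp) {
    // Assumption: cp is a valid scalar value (not a surrogate, <= U+10FFFF).
    assert(cp <= 0x10FFFF && !(cp >= 0xD800 && cp <= 0xDFFF));
    std::u16string out;
    if (cp < 0x10000) {
        out.push_back(static_cast<char16_t>(cp));
    } else {
        const char32_t v = cp - 0x10000;  // 20 bits, split 10 + 10
        out.push_back(static_cast<char16_t>(0xD800 + (v >> 10)));    // high surrogate
        out.push_back(static_cast<char16_t>(0xDC00 + (v & 0x3FF)));  // low surrogate
    }
    return out;
}

// Hypothetical helper: encode one code point as UTF-8 bytes (1 to 4 bytes).
std::string to_utf8(char32_t cp) {
    assert(cp <= 0x10FFFF && !(cp >= 0xD800 && cp <= 0xDFFF));
    std::string out;
    if (cp < 0x80) {
        out.push_back(static_cast<char>(cp));
    } else if (cp < 0x800) {
        out.push_back(static_cast<char>(0xC0 | (cp >> 6)));
        out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
    } else if (cp < 0x10000) {
        out.push_back(static_cast<char>(0xE0 | (cp >> 12)));
        out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
        out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
    } else {
        out.push_back(static_cast<char>(0xF0 | (cp >> 18)));
        out.push_back(static_cast<char>(0x80 | ((cp >> 12) & 0x3F)));
        out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
        out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
    }
    return out;
}
```

For example, U+4E2D (中) is the single UTF-16 code unit 0x4E2D but the three UTF-8 bytes E4 B8 AD, while U+20000 becomes the UTF-16 surrogate pair D840 DC00 and the four UTF-8 bytes F0 A0 80 80.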

EDIT: Using UTF-16LE in mb_convert_encoding will give you the reverse representation.
