Correctly reading a UTF-16 text file into a string without external libraries?

醉酒成梦 2020-11-29 05:24

I've been using StackOverflow since the beginning, and have on occasion been tempted to post questions, but I've always either figured them out myself or found answers pos

3 answers
  •  情话喂你
    2020-11-29 06:08

    When you open a UTF-16 file, you must open it in binary mode. In text mode (on Windows), certain bytes are interpreted specially: a 0x0d that precedes 0x0a is dropped, and 0x1a marks the end of the file. Some UTF-16 characters have one of those bytes as half of their code unit, and they will corrupt the read. This is not a bug; it is intentional behavior and is the sole reason for having separate text and binary modes.
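
As a rough illustration of the binary-mode approach, here is a minimal sketch using only the standard library. The helper name `read_utf16_file` is hypothetical, and the sketch assumes the data is little-endian unless it starts with a big-endian byte order mark (FE FF). It does not validate surrogate pairs.

```cpp
#include <cstddef>
#include <fstream>
#include <iterator>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: reads a whole UTF-16 file into a std::u16string.
// Assumption: little-endian unless a big-endian BOM (FE FF) is present.
std::u16string read_utf16_file(const std::string& path) {
    // std::ios::binary turns off newline translation and Ctrl-Z handling,
    // so bytes such as 0x0d and 0x1a arrive untouched.
    std::ifstream in(path, std::ios::binary);
    if (!in) throw std::runtime_error("cannot open " + path);

    // Slurp the raw bytes first; decoding happens afterwards.
    std::vector<unsigned char> bytes((std::istreambuf_iterator<char>(in)),
                                     std::istreambuf_iterator<char>());
    if (bytes.size() % 2 != 0) throw std::runtime_error("odd byte count");

    // Detect and skip a byte order mark, if there is one.
    bool big_endian = false;
    std::size_t pos = 0;
    if (bytes.size() >= 2) {
        if (bytes[0] == 0xFE && bytes[1] == 0xFF) { big_endian = true; pos = 2; }
        else if (bytes[0] == 0xFF && bytes[1] == 0xFE) { pos = 2; }
    }

    // Combine each byte pair into one 16-bit code unit.
    std::u16string text;
    text.reserve((bytes.size() - pos) / 2);
    for (; pos + 1 < bytes.size(); pos += 2) {
        unsigned hi = big_endian ? bytes[pos] : bytes[pos + 1];
        unsigned lo = big_endian ? bytes[pos + 1] : bytes[pos];
        text.push_back(static_cast<char16_t>((hi << 8) | lo));
    }
    return text;
}
```

Decoding the bytes by hand, rather than reading straight into a wide-character buffer, keeps the result independent of the host's byte order and of the size of `wchar_t`.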

    For the reason why 0x1a is considered the end of a file, see this blog post from Raymond Chen tracing the history of Ctrl-Z. It's basically backwards compatibility run amok.
