Handling special characters in C (UTF-8 encoding)

天涯浪人 2020-12-07 21:17

I'm writing a small application in C that reads a simple text file and then outputs the lines one by one. The problem is that the text file contains special characters like

4 answers
  •  抹茶落季
    2020-12-07 22:01

    Your text file is probably encoded as ISO-8859-1 while your terminal expects UTF-8. This kind of mismatch is a standard problem with byte-oriented text handling; other C programs (such as the standard 'cat' and 'more' commands) behave the same way, and it isn't generally considered an error or something that needs to be fixed.
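
    To make the mismatch concrete, here is a minimal sketch of converting ISO-8859-1 bytes to UTF-8 before printing them. It assumes the input really is ISO-8859-1; the function name latin1_to_utf8 is hypothetical and not part of any standard library. Every ISO-8859-1 byte is its own Unicode code point, so bytes below 0x80 pass through unchanged and bytes from 0x80 upward become two UTF-8 bytes.

```c
#include <stddef.h>

/* Hypothetical helper: convert ISO-8859-1 (Latin-1) bytes to UTF-8.
 * Writes a NUL-terminated result into dst.
 * Returns the number of bytes written (excluding the NUL),
 * or (size_t)-1 if dst is too small. */
size_t latin1_to_utf8(const unsigned char *src, size_t src_len,
                      char *dst, size_t dst_size)
{
    size_t out = 0;

    if (dst_size == 0)
        return (size_t)-1;

    for (size_t i = 0; i < src_len; i++) {
        unsigned char c = src[i];
        size_t need = (c < 0x80) ? 1 : 2;

        /* Always keep one byte free for the terminating NUL. */
        if (out + need >= dst_size)
            return (size_t)-1;

        if (c < 0x80) {
            /* ASCII range: identical in both encodings. */
            dst[out++] = (char)c;
        } else {
            /* U+0080..U+00FF: two-byte UTF-8 sequence 110xxxxx 10xxxxxx. */
            dst[out++] = (char)(0xC0 | (c >> 6));
            dst[out++] = (char)(0x80 | (c & 0x3F));
        }
    }
    dst[out] = '\0';
    return out;
}
```

    For example, the ISO-8859-1 byte 0xE9 (é) comes out as the two bytes 0xC3 0xA9, which a UTF-8 terminal displays correctly. For anything beyond ISO-8859-1, a conversion library such as iconv is probably a better fit than hand-written code.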

    If you want to operate on Unicode characters instead of bytes, that's fine, but you'll need to use wchar_t as your character type instead of char throughout your program, and provide switches that let the user specify what the incoming file encoding actually is. (Whilst it is sometimes possible to guess the encoding, guessing is not very reliable.)
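
    As a rough sketch of that approach, the code below decodes one line of bytes into wchar_t according to an encoding chosen by the user. All names here (input_encoding, parse_encoding, decode_line) and the accepted switch values are hypothetical. It assumes only ISO-8859-1 and UTF-8 need to be supported, and it does not reject overlong UTF-8 sequences or surrogate code points, which a production decoder should.

```c
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical set of input encodings the user can select. */
enum input_encoding { ENC_LATIN1, ENC_UTF8, ENC_UNKNOWN };

/* Map a switch value such as "latin1" or "utf-8" to an encoding. */
enum input_encoding parse_encoding(const char *name)
{
    if (strcmp(name, "latin1") == 0 || strcmp(name, "iso-8859-1") == 0)
        return ENC_LATIN1;
    if (strcmp(name, "utf8") == 0 || strcmp(name, "utf-8") == 0)
        return ENC_UTF8;
    return ENC_UNKNOWN;
}

/* Decode len bytes from src into wide characters in dst.
 * Returns the number of wide characters written (excluding the
 * terminator), or (size_t)-1 on malformed input, an unknown
 * encoding, or insufficient space. */
size_t decode_line(const unsigned char *src, size_t len,
                   enum input_encoding enc,
                   wchar_t *dst, size_t dst_len)
{
    size_t out = 0, i = 0;

    if (dst_len == 0 || enc == ENC_UNKNOWN)
        return (size_t)-1;

    while (i < len) {
        unsigned char b = src[i++];
        unsigned long cp;
        size_t extra = 0;   /* continuation bytes still expected */

        if (enc == ENC_LATIN1 || b < 0x80) {
            cp = b;         /* Latin-1 byte or ASCII: value is the code point */
        } else if ((b & 0xE0) == 0xC0) {
            cp = b & 0x1F; extra = 1;
        } else if ((b & 0xF0) == 0xE0) {
            cp = b & 0x0F; extra = 2;
        } else if ((b & 0xF8) == 0xF0) {
            cp = b & 0x07; extra = 3;
        } else {
            return (size_t)-1;   /* stray continuation or invalid lead byte */
        }

        for (; extra > 0; extra--) {
            if (i >= len || (src[i] & 0xC0) != 0x80)
                return (size_t)-1;   /* truncated or malformed sequence */
            cp = (cp << 6) | (src[i++] & 0x3F);
        }

        /* Where wchar_t is 16 bits wide, larger code points do not fit. */
        if (cp > (unsigned long)WCHAR_MAX)
            return (size_t)-1;
        if (out + 1 >= dst_len)
            return (size_t)-1;   /* keep room for the terminator */
        dst[out++] = (wchar_t)cp;
    }
    dst[out] = L'\0';
    return out;
}
```

    With this, the same character é decodes to the single wide character 0xE9 whether the file stores it as the ISO-8859-1 byte 0xE9 or as the UTF-8 bytes 0xC3 0xA9, provided the user names the right encoding. Decoding ISO-8859-1 bytes as UTF-8 typically fails, which is why the switch matters.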
