How to read a text file without knowing the encoding

Submitted by 元气小坏坏 on 2019-12-03 13:30:06
Ole Begemann

Apple's documentation has some guidance on how to proceed, in the String Programming Guide section "Reading data with an unknown encoding":

If you are forced to guess the encoding (and note that in the absence of explicit information, it is a guess):

  1. Try stringWithContentsOfFile:usedEncoding:error: or initWithContentsOfFile:usedEncoding:error: (or the URL-based equivalents). These methods try to determine the encoding of the resource, and if successful return by reference the encoding used.

  2. If (1) fails, try to read the resource by specifying UTF-8 as the encoding.

  3. If (2) fails, try an appropriate legacy encoding. "Appropriate" here depends a bit on circumstances; it might be the default C string encoding, it might be ISO or Windows Latin 1, or something else, depending on where your data is coming from.
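The three steps above can be sketched in Swift roughly as follows. This is only a minimal sketch: `readTextGuessingEncoding` is a hypothetical helper name, and the default list of legacy encodings (Windows Latin 1, then ISO Latin 1) is an assumption that you should replace with whatever fits the origin of your data.

```swift
import Foundation

// Hypothetical helper: follows the three-step fallback quoted above.
// The legacy encodings tried in step 3 are an assumption; pick ones
// that match where your data comes from.
func readTextGuessingEncoding(
    at url: URL,
    legacyEncodings: [String.Encoding] = [.windowsCP1252, .isoLatin1]
) -> (text: String, encoding: String.Encoding)? {
    // Step 1: let Foundation try to determine the encoding.
    // On success, `detected` holds the encoding that was used.
    var detected = String.Encoding.utf8
    if let text = try? String(contentsOf: url, usedEncoding: &detected) {
        return (text, detected)
    }

    // Step 2: try UTF-8 explicitly.
    if let text = try? String(contentsOf: url, encoding: .utf8) {
        return (text, .utf8)
    }

    // Step 3: fall back to legacy encodings, in the order given.
    for encoding in legacyEncodings {
        if let text = try? String(contentsOf: url, encoding: encoding) {
            return (text, encoding)
        }
    }

    // Nothing worked; the caller decides what to do next.
    return nil
}
```

Keep in mind that a single-byte encoding such as ISO Latin 1 accepts almost any byte sequence, so a successful read in step 3 does not prove the guess was right. Returning the encoding alongside the text lets the caller log or display which guess was used.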

If the file is properly constructed, you can read the first few bytes (a BOM is two to four bytes long, depending on the encoding) and check whether they form a BOM (Byte Order Mark):

http://en.wikipedia.org/wiki/Byte-order_mark
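A BOM check along these lines might look like the sketch below. `encodingFromBOM` is a hypothetical helper, not a Foundation API, and it only covers the UTF-8, UTF-16, and UTF-32 marks.

```swift
import Foundation

// Hypothetical helper: maps a leading byte order mark to an encoding.
// Returns nil when no known BOM is present, which says nothing about
// the actual encoding (many UTF-8 files carry no BOM at all).
func encodingFromBOM(_ data: Data) -> String.Encoding? {
    let head = [UInt8](data.prefix(4))

    // The 4-byte UTF-32 marks are checked first, because the UTF-32 LE
    // mark starts with the same two bytes as the UTF-16 LE mark.
    let marks: [([UInt8], String.Encoding)] = [
        ([0x00, 0x00, 0xFE, 0xFF], .utf32BigEndian),
        ([0xFF, 0xFE, 0x00, 0x00], .utf32LittleEndian),
        ([0xEF, 0xBB, 0xBF], .utf8),
        ([0xFE, 0xFF], .utf16BigEndian),
        ([0xFF, 0xFE], .utf16LittleEndian),
    ]

    for (mark, encoding) in marks where head.starts(with: mark) {
        return encoding
    }
    return nil
}
```

One ambiguity remains: the bytes FF FE 00 00 could also be a UTF-16 little-endian file whose first character is NUL. The sketch treats them as UTF-32, which is the more likely reading for a text file.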
