How can I detect the encoding/codepage of a text file

梦如初夏 2020-11-21 22:42

In our application, we receive text files (.txt, .csv, etc.) from diverse sources. When reading, these files sometimes contain garbage, because the

20 answers
  •  广开言路
    2020-11-21 23:30

    I was actually looking for a generic, non-programmatic way of detecting the file encoding, but I haven't found one yet. What I did find, by testing with different encodings, was that my text was UTF-7.

    So where I was originally doing: StreamReader file = File.OpenText(fullfilename);

    I had to change it to: StreamReader file = new StreamReader(fullfilename, System.Text.Encoding.UTF7);

    OpenText assumes it's UTF-8.

    You can also create the StreamReader like this: new StreamReader(fullfilename, true). The second parameter means it should try to detect the encoding from the file's byte order mark, but that didn't work in my case.
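
    For illustration, here is a minimal sketch of what byte order mark detection amounts to, assuming the file starts with one of the common Unicode BOMs. The names BomSniffer, DetectFromBom and ReadAllText are hypothetical helpers, not part of .NET. A file with no BOM, or one in an encoding this sketch does not check (UTF-7, for example), yields null, so the caller has to supply a fallback encoding.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper for illustration; not part of the .NET libraries.
public static class BomSniffer
{
    // Returns the encoding indicated by a leading byte order mark,
    // or null when the bytes do not start with a BOM checked here.
    public static Encoding DetectFromBom(byte[] b)
    {
        // UTF-32 LE must be tested before UTF-16 LE: both start with FF FE.
        if (b.Length >= 4 && b[0] == 0xFF && b[1] == 0xFE && b[2] == 0x00 && b[3] == 0x00)
            return Encoding.UTF32;
        if (b.Length >= 4 && b[0] == 0x00 && b[1] == 0x00 && b[2] == 0xFE && b[3] == 0xFF)
            return new UTF32Encoding(bigEndian: true, byteOrderMark: true);
        if (b.Length >= 3 && b[0] == 0xEF && b[1] == 0xBB && b[2] == 0xBF)
            return Encoding.UTF8;
        if (b.Length >= 2 && b[0] == 0xFF && b[1] == 0xFE)
            return Encoding.Unicode;          // UTF-16 little endian
        if (b.Length >= 2 && b[0] == 0xFE && b[1] == 0xFF)
            return Encoding.BigEndianUnicode; // UTF-16 big endian
        return null;
    }

    // Reads a whole file, using the BOM when present and the
    // caller-supplied fallback encoding otherwise.
    public static string ReadAllText(string path, Encoding fallback)
    {
        byte[] head = new byte[4];
        int n;
        using (FileStream fs = File.OpenRead(path))
        {
            n = fs.Read(head, 0, head.Length);
        }
        Array.Resize(ref head, n);

        Encoding enc = DetectFromBom(head) ?? fallback;
        using (var reader = new StreamReader(path, enc, detectEncodingFromByteOrderMarks: true))
        {
            return reader.ReadToEnd();
        }
    }
}
```

    Usage would look like BomSniffer.ReadAllText(fullfilename, Encoding.UTF8). The fallback is a guess by design: without a BOM, the bytes alone do not say which encoding produced them, so it has to be known from the source of the file or found by trying candidates, as described above.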
