How to guess the encoding of a file with no BOM in .NET?

野趣味 2020-12-14 13:07

I'm using the StreamReader class in .NET like this:

using( StreamReader reader = new StreamReader( "c:\\somefile.html", true ) ) {
    string filetext = rea         


        
8 answers
  •  爱一瞬间的悲伤
    2020-12-14 13:44

    A hacky technique might be to take an MD5 of the file's raw bytes, then decode those bytes and re-encode the result in each candidate encoding, taking an MD5 of each re-encoded output. If one hash matches the original, you guess it's that encoding.

    That's obviously too slow for something that handles a lot of files but for something like a text editor I could see it working.
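    The round trip described above could be sketched roughly as follows. This is only an illustration under assumptions: the `EncodingGuesser` class, the `GuessByRoundTrip` method, and the candidate list are hypothetical names, not part of .NET. Note that single-byte encodings such as ISO-8859-1 round-trip almost any input, so the order of candidates matters and the result is still a guess.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper, not a framework API.
public static class EncodingGuesser
{
    // Returns the first candidate whose decode/re-encode round trip
    // reproduces the original bytes, or null if none does.
    public static Encoding GuessByRoundTrip(byte[] data, IEnumerable<Encoding> candidates)
    {
        using (MD5 md5 = MD5.Create())
        {
            byte[] originalHash = md5.ComputeHash(data);

            foreach (Encoding candidate in candidates)
            {
                // Invalid byte sequences decode to replacement characters,
                // so re-encoding them yields different bytes and a different hash.
                string text = candidate.GetString(data);
                byte[] reencoded = candidate.GetBytes(text);

                if (md5.ComputeHash(reencoded).SequenceEqual(originalHash))
                {
                    return candidate;
                }
            }
        }

        return null;
    }

    // Assumed ordering: strictest encodings first, permissive ones last.
    public static IEnumerable<Encoding> DefaultCandidates()
    {
        yield return Encoding.ASCII;
        yield return new UTF8Encoding(false);
        yield return Encoding.GetEncoding("iso-8859-1");
    }
}
```

    A caller might pass `File.ReadAllBytes(path)` together with `EncodingGuesser.DefaultCandidates()`. Comparing the byte arrays directly would work just as well as hashing; the MD5 step is kept here only because the answer describes it that way.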

    Other than that, you'll have to get your hands dirty porting the Java libraries from this post, which came from the Delphi SO question, or use the IE MLang feature.
