How to read text files with ANSI encoding and non-English letters?

走了就别回头了 2020-12-02 20:21

I have a file that contains non-English chars and was saved in ANSI encoding using a non-English codepage. How can I read this file in C# and see the file content correctly?

4 Answers
  • 2020-12-02 20:37
     var text = File.ReadAllText(file, Encoding.GetEncoding(codePage));
    

    List of code pages: http://msdn.microsoft.com/en-us/library/windows/desktop/dd317756(v=vs.85).aspx
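
A slightly fuller sketch of the one-liner in this answer. The class and method names are hypothetical, and the code page is a parameter because it has to match whatever the file was actually saved with. One assumption worth flagging: on .NET Core and .NET 5+ the legacy code pages are not registered by default, so the sketch registers `CodePagesEncodingProvider` first; on .NET Framework the code pages work without that step.

```csharp
using System.IO;
using System.Text;

public static class AnsiFileReader
{
    // Hypothetical helper: reads a whole file using a legacy ("ANSI") code page.
    public static string ReadAnsiFile(string path, int codePage)
    {
        // Makes code pages such as 1251 or 1252 available on .NET Core / .NET 5+.
        // Registering the provider more than once is harmless.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);

        // Decode the file's bytes with the requested code page instead of UTF-8.
        return File.ReadAllText(path, Encoding.GetEncoding(codePage));
    }
}
```

For a file saved with the Cyrillic code page, for example, the call would look like `AnsiFileReader.ReadAnsiFile(path, 1251)`.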

  • 2020-12-02 20:39

    You get the question-mark-diamond characters (the Unicode replacement character, U+FFFD) when your text file uses high-ANSI encoding, meaning it contains characters between 128 and 255. Those characters have the eighth (i.e. the most significant) bit set. When ASP.NET reads the text file it assumes UTF-8 encoding, where a byte with that bit set is only valid as part of a multi-byte sequence, so a lone high-ANSI byte cannot be decoded.

    You must force ASP.NET to interpret the text file as high-ANSI encoding by telling it the code page is 1252 (Western European; substitute the code page the file was actually saved with):

    String textFilePhysicalPath = System.Web.HttpContext.Current.Server.MapPath("~/textfiles/MyInputFile.txt");
    String contents = File.ReadAllText(textFilePhysicalPath, System.Text.Encoding.GetEncoding(1252));
    lblContents.Text = contents.Replace("\n", "<br />");  // change linebreaks to HTML
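
To make the explanation in this answer concrete, here is a minimal sketch that decodes the same bytes once as UTF-8 and once with a code page. The class and method names are hypothetical, and the `RegisterProvider` call is only an assumption for .NET Core / .NET 5+, where code page 1252 is not registered by default.

```csharp
using System.Text;

public static class DecodingDemo
{
    // Decodes the bytes as UTF-8, which is what File.ReadAllText assumes
    // when no encoding is passed and the file has no byte-order mark.
    public static string DecodeAsUtf8(byte[] bytes)
    {
        return Encoding.UTF8.GetString(bytes);
    }

    // Decodes the same bytes with an explicit legacy code page.
    public static string DecodeWithCodePage(byte[] bytes, int codePage)
    {
        // Makes code page 1252 and friends available on .NET Core / .NET 5+.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        return Encoding.GetEncoding(codePage).GetString(bytes);
    }
}
```

Given the bytes `0x63 0x61 0x66 0xE9` ("café" as a Windows-1252 editor would save it), `DecodeAsUtf8` cannot interpret the lone `0xE9` and substitutes U+FFFD for it, while `DecodeWithCodePage(bytes, 1252)` returns the intended text.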
    
  • 2020-12-02 20:44
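    // Appends text to the file using code page 1250 (Central European).
    // Use File.Create(path) instead of File.Open(..., FileMode.Append) to start a new file.
    using (StreamWriter writer = new StreamWriter(File.Open(@"E:\Sample.txt", FileMode.Append), Encoding.GetEncoding(1250)))
    {
        writer.Write("Sample Text");
    }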
    
  • 2020-12-02 20:55

    If I remember correctly, the XmlDocument.Load(string) method always assumes UTF-8, regardless of the XML encoding. You would have to create a StreamReader with the correct encoding and pass that to Load instead.

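    // File.Open requires a FileMode argument, so File.OpenRead is used here.
    using (var reader = new StreamReader(File.OpenRead("file.xml"), Encoding.GetEncoding("iso-8859-15")))
    {
        xmlDoc.Load(reader);
    }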
    

    I just stumbled across KB308061 from Microsoft. There's an interesting passage:

    Specify the encoding declaration in the XML declaration section of the XML document. For example, the following declaration indicates that the document is in UTF-16 Unicode encoding format:

    <?xml version="1.0" encoding="UTF-16"?>
    

    Note that this declaration only specifies the encoding format of an XML document and does not modify or control the actual encoding format of the data.

    Link Source:

    XmlDocument.Load() method fails to decode € (euro)
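
The StreamReader approach from this answer could be wrapped up roughly as follows. This is only a sketch: the class and method names are hypothetical, and the `RegisterProvider` call is an assumption for .NET Core / .NET 5+, where encodings such as iso-8859-15 are not registered by default.

```csharp
using System.IO;
using System.Text;
using System.Xml;

public static class XmlEncodingLoader
{
    // Hypothetical helper: loads XML from a stream, decoding the bytes
    // with an explicitly named encoding rather than a detected one.
    public static XmlDocument LoadWithEncoding(Stream input, string encodingName)
    {
        // Makes legacy encodings such as iso-8859-15 available on .NET Core / .NET 5+.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);

        var xmlDoc = new XmlDocument();
        using (var reader = new StreamReader(input, Encoding.GetEncoding(encodingName)))
        {
            // The reader has already turned bytes into characters,
            // so the parser sees correctly decoded text.
            xmlDoc.Load(reader);
        }
        return xmlDoc;
    }
}
```

With ISO-8859-15 input, the single byte `0xA4` then comes through as the euro sign, e.g. `XmlEncodingLoader.LoadWithEncoding(File.OpenRead("file.xml"), "iso-8859-15")`.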
