C# encoding problems (question marks) while reading a file with StreamReader

Posted by 此生再无相见时 on 2020-01-11 10:18:29

Question


I have a problem reading a .txt file from my Windows Phone app.

I've made a simple app that reads a stream from a .txt file and prints it.

The file is in Italian, which has many accented letters, and that is where the problem shows up: every accented letter is printed as a question mark.

Here's the sample code:

var resourceStream = Application.GetResourceStream(new Uri("frasi.txt", UriKind.RelativeOrAbsolute));
if (resourceStream != null)
{
    //System.Text.Encoding.Default, true
    using (var reader = new StreamReader(resourceStream.Stream, System.Text.Encoding.UTF8))
    {
        // Read the file line by line into the frasi collection.
        string line = reader.ReadLine();

        while (line != null)
        {
            frasi.Add(line);
            line = reader.ReadLine();
        }
    }
}

How can I avoid this?

All the best.

[EDIT:] Solution: I hadn't made sure the file was encoded in UTF-8. I re-saved it with the correct encoding and it worked like a charm. Thank you, Oscar.
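For anyone wondering why re-saving the file made the difference, here is a minimal sketch of the mismatch. It is written as a small console program rather than a Windows Phone app, and the byte arrays are an assumption standing in for the contents of frasi.txt (the word "perché" is just an illustrative sample, not taken from the original file).

```csharp
using System;
using System.Text;

class EncodingMismatchDemo
{
    static void Main()
    {
        // "perché" saved with a single-byte ANSI code page such as Windows-1252:
        // the accented "é" is the single byte 0xE9.
        byte[] ansiBytes = { 0x70, 0x65, 0x72, 0x63, 0x68, 0xE9 };

        // The same word saved as UTF-8: "é" becomes the two bytes 0xC3 0xA9.
        byte[] utf8Bytes = { 0x70, 0x65, 0x72, 0x63, 0x68, 0xC3, 0xA9 };

        // Decode both byte sequences as UTF-8, as the StreamReader in the question does.
        string fromAnsi = Encoding.UTF8.GetString(ansiBytes, 0, ansiBytes.Length);
        string fromUtf8 = Encoding.UTF8.GetString(utf8Bytes, 0, utf8Bytes.Length);

        // A lone 0xE9 is not valid UTF-8, so the decoder substitutes U+FFFD
        // (the replacement character), which is usually drawn as a question mark.
        Console.WriteLine(fromAnsi.Contains("\uFFFD"));   // prints True

        // The UTF-8 bytes decode to the expected text.
        Console.WriteLine(fromUtf8 == "perch\u00E9");     // prints True
    }
}
```

So the reader was doing what it was told; the bytes in the file simply were not UTF-8 until it was saved again.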


Answer 1:


You need to use Encoding.Default. Change:

using (var reader = new StreamReader(resourceStream.Stream, System.Text.Encoding.UTF8))

to

using (var reader = new StreamReader(resourceStream.Stream, System.Text.Encoding.Default))



Answer 2:


What you have commented out is what you should be using if you do not know the exact encoding of your source data. System.Text.Encoding.Default uses the encoding for the operating system's current ANSI code page and provides the best chance of a correct encoding. It follows the machine's current regional settings rather than inspecting the file itself.

However, note this warning from MSDN:

Different computers can use different encodings as the default, and the default encoding can even change on a single computer. Therefore, data streamed from one computer to another or even retrieved at different times on the same computer might be translated incorrectly. In addition, the encoding returned by the Default property uses best-fit fallback to map unsupported characters to characters supported by the code page. For these two reasons, using the default encoding is generally not recommended. To ensure that encoded bytes are decoded properly, your application should use a Unicode encoding, such as UTF8Encoding or UnicodeEncoding, with a preamble. Another option is to use a higher-level protocol to ensure that the same format is used for encoding and decoding.

Despite this, in my experience with data coming from a number of different sources and cultures, this is the one that provides the most consistent results out of the box, especially for diacritic marks, which turn into question marks when ANSI-encoded text is read as UTF-8.
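If you control how the file is written, the MSDN suggestion quoted above (a Unicode encoding with a preamble) can be sketched roughly like this. It is a console-style illustration under assumptions: the MemoryStream stands in for the real file, and the sample text is made up. The reader is deliberately given the wrong fallback encoding to show that the preamble takes precedence.

```csharp
using System;
using System.IO;
using System.Text;

class PreambleDemo
{
    static void Main()
    {
        // The MemoryStream is a stand-in for the file on disk.
        using (var buffer = new MemoryStream())
        {
            // UTF8Encoding(true) asks the writer to emit the UTF-8 preamble
            // (the bytes EF BB BF) at the start of the stream.
            using (var writer = new StreamWriter(buffer, new UTF8Encoding(true), 1024, true))
            {
                writer.WriteLine("perch\u00E9");
            }

            buffer.Position = 0;

            // Encoding.Unicode (UTF-16) is a deliberately wrong fallback; passing
            // true for detectEncodingFromByteOrderMarks lets the reader switch
            // to the encoding announced by the preamble.
            using (var reader = new StreamReader(buffer, Encoding.Unicode, true))
            {
                string line = reader.ReadLine();
                Console.WriteLine(line == "perch\u00E9");
                Console.WriteLine(reader.CurrentEncoding.WebName);
            }
        }
    }
}
```

This only helps for files that actually carry a preamble; a file saved in an ANSI code page has none, so the reader falls back to whatever encoding you pass in.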

I hope this helps.



Source: https://stackoverflow.com/questions/21857749/c-sharp-encoding-problems-question-marks-while-reading-file-from-streamreader
