How to read Cyrillic symbols from a .txt file with C#

Posted by 女生的网名这么多〃 on 2019-12-10 22:57:01

Question


I saw similar topics but could not find a solution. I have a .txt file whose contents are in Bulgarian (which uses the Cyrillic alphabet), but when I try to read it the characters do not come out correctly. I tried to read the file with this code:

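// Requires: using System; using System.IO; using System.Text;
if (File.Exists(fileName))
{
    // Open the reader only after the existence check, and dispose it when done.
    using (StreamReader reader = new StreamReader(fileName, Encoding.UTF8))
    {
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            Console.WriteLine(line);
        }
    }
}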

I also changed the Encoding argument to every value I could find, including GetEncoding(1251), which I read is the code page for Cyrillic. When saving the .txt file I tried each encoding the editor offered (Unicode, UTF-8, BigEndianUnicode, ANSI), in every combination with the Encoding I set in code, but again with no success.

Any ideas on how to read the Cyrillic symbols correctly would be appreciated. Here is some sample text: "Ето примерен текст."

Thanks in advance! :)


Answer 1:


Your problem is that the console can't display Cyrillic characters. Put a breakpoint on the Console.WriteLine call and inspect the line variable in the debugger. Of course, you still need to read the file with the correct encoding first! :-)

If you don't trust me, try this: make a console program that does the following:

string line = "Ето примерен текст"; 
Console.WriteLine(line);
return 0;

Put a breakpoint on the return 0; line, then compare what the console shows with the value of the line variable in the debugger.
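
If attaching a debugger is inconvenient, the same check can be done by printing the code units of the string instead of the characters themselves, since ASCII hex digits display in any console. The following is only a minimal sketch: the CodePointDemo class and its Describe method are hypothetical names introduced for illustration, not part of the original answer.

```csharp
using System;
using System.Linq;

static class CodePointDemo
{
    // Formats each UTF-16 code unit of the string as U+XXXX, separated by spaces.
    public static string Describe(string text)
    {
        return string.Join(" ", text.Select(c => "U+" + ((int)c).ToString("X4")));
    }

    static void Main()
    {
        string line = "Ето примерен текст";
        // The output is plain ASCII, so it is readable even if the console cannot show Cyrillic.
        Console.WriteLine(Describe(line));
    }
}
```

Cyrillic letters fall in the U+0400 to U+04FF range, so values in that range suggest the string in memory is fine and only the display is at fault. Runs of U+FFFD or U+003F suggest the file was decoded with the wrong encoding.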

I'll add that Unicode console support should be one of the "new" things in .NET 4.5.

You can also read this page: c# unicode string output




Answer 2:


The problem you are having is not reading the text, but displaying it.

If your real intention is to display Unicode text in a console window, then you'll have to make a few changes. If, however, you will be displaying the text in a WinForms or WPF app, for instance, then you will not have problems: they work with Unicode by default.

By default, the console neither uses a Unicode code page nor a font that contains the glyphs you need. You need to do the following:

  1. Save your text file as UTF-8.
  2. Start a console which is Unicode enabled: cmd /u
  3. Change the font to a TrueType font that has Cyrillic glyphs, such as "Lucida Console" or "Consolas": console window menu -> properties -> font
  4. Change the code page to UTF-8: chcp 65001
  5. Run your app.

Your characters will now be displayed correctly.
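
The code page step can also be attempted from inside the program by setting Console.OutputEncoding before writing. The sketch below assumes the file was saved as UTF-8; the ConsoleUtf8Demo class, the ReadUtf8Lines helper, and the temporary file name are hypothetical and exist only to keep the example self-contained. Whether the glyphs actually render still depends on the console font chosen in step 3.

```csharp
using System;
using System.IO;
using System.Text;

static class ConsoleUtf8Demo
{
    // Reads all lines of a text file, decoding the bytes as UTF-8.
    public static string[] ReadUtf8Lines(string path)
    {
        return File.ReadAllLines(path, Encoding.UTF8);
    }

    static void Main()
    {
        // Ask the console to use UTF-8 for output, similar in effect to "chcp 65001".
        Console.OutputEncoding = Encoding.UTF8;

        // Hypothetical sample file, written here so the sketch runs on its own.
        string path = Path.Combine(Path.GetTempPath(), "cyrillic-sample.txt");
        File.WriteAllText(path, "Ето примерен текст.", Encoding.UTF8);

        foreach (string line in ReadUtf8Lines(path))
        {
            Console.WriteLine(line);
        }
    }
}
```

Setting the encoding in code means the program does not rely on the user having run chcp first, but it does not change the console font.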



Source: https://stackoverflow.com/questions/7821118/how-to-read-cyrillic-symbols-from-a-txt-file-with-c-sharp
