Vietnamese character in .NET Console Application (UTF-8)

Submitted by 十年热恋 on 2019-12-23 08:09:08

Question


I'm trying to write a UTF-8 (Vietnamese) string to the C# console, without success. I'm running Windows 7.

I tried using the Encoding class to convert the string to char[], then to byte[], and then back to a string, but that did not help. The string comes directly from the database.

Here is an example:

Tôi tên là Đức, cuộc sống thật vui vẻ tuyệt vời

The console does not show special characters such as Đ; they show up as ? instead, and the result is even worse with the Encoding class.

Can anyone try this out, or does anyone know about this problem?


My code

static void Main(string[] args)
{
    XDataContext _new = new XDataContext();
    Console.OutputEncoding = Encoding.GetEncoding("UTF-8");
    string srcString = _new.Posts.First().TITLE;

    Console.WriteLine(srcString);
    // Convert the UTF-16 encoded source string to UTF-8 and ASCII.
    byte[] utf8String = Encoding.UTF8.GetBytes(srcString);
    byte[] asciiString = Encoding.ASCII.GetBytes(srcString);

    // Write the UTF-8 and ASCII encoded byte arrays. 
    Console.WriteLine("UTF-8  Bytes: {0}", BitConverter.ToString(utf8String));
    Console.WriteLine("ASCII  Bytes: {0}", BitConverter.ToString(asciiString));


    // Convert UTF-8 and ASCII encoded bytes back to UTF-16 encoded  
    // string and write.
    Console.WriteLine("UTF-8  Text : {0}", Encoding.UTF8.GetString(utf8String));
    Console.WriteLine("ASCII  Text : {0}", Encoding.ASCII.GetString(asciiString));

    Console.WriteLine(Encoding.UTF8.GetString(utf8String));
    Console.WriteLine(Encoding.ASCII.GetString(asciiString));
}

and here is the resulting output

Nhà báo đi hội báo Xuân
UTF-8  Bytes: 4E-68-C3-A0-20-62-C3-A1-6F-20-C4-91-69-20-68-E1-BB-99-69-20-62-C3-
A1-6F-20-58-75-C3-A2-6E
ASCII  Bytes: 4E-68-3F-20-62-3F-6F-20-3F-69-20-68-3F-69-20-62-3F-6F-20-58-75-3F-
6E
UTF-8  Text : Nhà báo đi hội báo Xuân
ASCII  Text : Nh? b?o ?i h?i b?o Xu?n
Nhà báo đi hội báo Xuân
Nh? b?o ?i h?i b?o Xu?n


Press any key to continue . . .

Answer 1:


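using System;
using System.Runtime.InteropServices;
using System.Text;

class Program
{
    // Win32 call that sets the output code page of the attached console.
    [DllImport("kernel32.dll")]
    static extern bool SetConsoleOutputCP(uint wCodePageID);

    static void Main(string[] args)
    {
        SetConsoleOutputCP(65001); // 65001 is the UTF-8 code page
        Console.OutputEncoding = Encoding.UTF8;
        Console.WriteLine("tést, тест, τεστ, ←↑→↓∏∑√∞①②③④, Bài viết chọn lọc");
        Console.ReadLine();
    }
}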

For the output to display correctly, the console window must use Consolas or another font that has all of the above characters.




Answer 2:


You will need to set Console.OutputEncoding to match UTF-8.

Probably something like:

Console.OutputEncoding = System.Text.Encoding.UTF8;



Answer 3:


Does the font you use in the Console window support the characters you are trying to display?
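
One way to tell a font problem from a data problem is to print the code points of the string rather than its glyphs. The sketch below is only an illustration: `CodePointDump.Describe` is a hypothetical helper, and the literal stands in for the value read from the database. If the dump shows the expected values (for example U+0110 for Đ) while the text itself still renders as ? or boxes, the string is intact and the console font or code page is the likelier culprit.

```csharp
using System;
using System.Linq;
using System.Text;

static class CodePointDump
{
    // Hypothetical helper: formats every UTF-16 code unit of s as U+XXXX.
    public static string Describe(string s)
    {
        return string.Join(" ", s.Select(c => "U+" + ((int)c).ToString("X4")));
    }
}

class Program
{
    static void Main()
    {
        Console.OutputEncoding = Encoding.UTF8;

        // Stand-in for the database value: "Đức" written with escapes
        // so the source file's own encoding cannot interfere.
        string title = "\u0110\u1EE9c";

        // The dump uses ASCII characters only, so it is readable in any font.
        Console.WriteLine(CodePointDump.Describe(title));
    }
}
```

If the dump already contains U+003F where an accented letter should be, the characters were lost before they reached the console, for instance in the database column type or the connection settings.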




Answer 4:


This is a problem with the cmd.exe console, not with C#/.NET: its Unicode support is limited, particularly with the default raster font.

If you can, try changing the program to a GUI app, or write the output to a file.
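
As a rough sketch of the write-to-a-file route, assuming a throwaway file in the temp directory is acceptable: the text is written as UTF-8 and read back to confirm nothing was lost. The `Utf8File` class, its `WriteAndReadBack` method, and the file name are hypothetical names chosen for illustration.

```csharp
using System;
using System.IO;
using System.Text;

static class Utf8File
{
    // Hypothetical helper: writes text as UTF-8 (with a byte order mark)
    // and returns what is read back from the same file.
    public static string WriteAndReadBack(string path, string text)
    {
        File.WriteAllText(path, text, Encoding.UTF8);
        return File.ReadAllText(path, Encoding.UTF8);
    }
}

class Program
{
    static void Main()
    {
        string text = "Tôi tên là Đức, cuộc sống thật vui vẻ tuyệt vời";

        // Assumed location; any writable path would do.
        string path = Path.Combine(Path.GetTempPath(), "vietnamese-output.txt");

        string roundTrip = Utf8File.WriteAndReadBack(path, text);
        Console.WriteLine(roundTrip == text ? "Round trip OK: " + path : "Mismatch");
    }
}
```

Opening the resulting file in an editor that understands UTF-8 shows the text independently of the console's font and code page.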



Source: https://stackoverflow.com/questions/2213541/vietnamese-character-in-net-console-application-utf-8
