How do I remove diacritics (accents) from a string in .NET?

南方客 2020-11-21 05:44

I'm trying to convert some strings that are in French Canadian and basically, I'd like to be able to take out the French accent marks in the letters while keeping the lett

20 answers
  •  生来不讨喜
    2020-11-21 06:18

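    The Greek (ISO) code page can do this: encoding accented Latin text with it produces the unaccented letters, as the example further down shows.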

    This code page is listed among the encodings returned by System.Text.Encoding.GetEncodings(). Read more at: https://msdn.microsoft.com/pt-br/library/system.text.encodinginfo.getencoding(v=vs.110).aspx

    Greek (ISO) has codepage 28597 and name iso-8859-7.
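
    To check the code page and name yourself, a lookup along these lines could work. This is only a sketch: the class and method names (EncodingLookup, FindByCodePage) are made up for illustration, and it assumes modern .NET (5 or later), where code-page encodings have to be registered before they appear. On .NET Framework the registration call should not be needed.

    ```csharp
    using System;
    using System.Linq;
    using System.Text;

    static class EncodingLookup
    {
        // Hypothetical helper: returns the EncodingInfo entries with the given code page.
        public static EncodingInfo[] FindByCodePage(int codePage)
        {
            // Assumption: running on .NET 5+, where legacy code pages are opt-in.
            Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);

            return Encoding.GetEncodings()
                           .Where(info => info.CodePage == codePage)
                           .ToArray();
        }

        static void Main()
        {
            // Print code page, name and display name of each match for 28597.
            foreach (EncodingInfo info in FindByCodePage(28597))
            {
                Console.WriteLine($"{info.CodePage} {info.Name} {info.DisplayName}");
            }
        }
    }
    ```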

    Go to the code... \o/

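    // Needs "using System.Text;" for Encoding, plus a reference to System.Web.
    string text = "Você está numa situação lamentável";
    
    // The iso-8859-7 encoder substitutes an unaccented letter for each accented one.
    string textEncode = System.Web.HttpUtility.UrlEncode(text, Encoding.GetEncoding("iso-8859-7"));
    //result: "Voce+esta+numa+situacao+lamentavel"
    
    // Decoding turns the '+' signs back into spaces.
    string textDecode = System.Web.HttpUtility.UrlDecode(textEncode);
    //result: "Voce esta numa situacao lamentavel"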
    

    So, write this function...

    // Strips accents by URL-encoding the text with iso-8859-7 and decoding it again.
    public string RemoveAcentuation(string text)
    {
        return
            System.Web.HttpUtility.UrlDecode(
                System.Web.HttpUtility.UrlEncode(
                    text, Encoding.GetEncoding("iso-8859-7")));
    }
    

    Note that Encoding.GetEncoding("iso-8859-7") is equivalent to Encoding.GetEncoding(28597): the first overload takes the encoding's name, and the second takes its code page number.
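
    If you want to confirm that the two lookups give the same encoding, something like the following could be used. It is a sketch under assumptions: GreekIsoLookup and Get are hypothetical names, and the RegisterProvider call assumes modern .NET (5 or later); on .NET Framework it should not be required.

    ```csharp
    using System;
    using System.Text;

    static class GreekIsoLookup
    {
        // Hypothetical helper: fetches the encoding once by name and once by code page.
        public static (Encoding ByName, Encoding ByCodePage) Get()
        {
            // Assumption: running on .NET 5+, where legacy code pages are opt-in.
            Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);

            return (Encoding.GetEncoding("iso-8859-7"), Encoding.GetEncoding(28597));
        }

        static void Main()
        {
            var (byName, byCodePage) = Get();

            // Compare the identifying properties of the two results.
            Console.WriteLine(byName.CodePage == byCodePage.CodePage);
            Console.WriteLine(byName.WebName == byCodePage.WebName);
        }
    }
    ```

    Both comparisons should print True if the two calls resolve to the same encoding.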
