Replacing characters in C# (ascii)

挽巷 2020-12-05 08:26

I have a file containing characters like these: à, è, ì, ò, ù and À. What I need to do is replace those characters with their plain, unaccented equivalents, e.g. à becomes a, è becomes e, and so on. This is my c

7 answers
  •  死守一世寂寞
    2020-12-05 09:08

    Others have commented on using a Unicode lookup table to remove diacritics. A quick Google search turned up this example. The code is shamelessly copied, reformatted, and posted below:

    using System;
    using System.Text;
    using System.Globalization;
    
    public static class Remove
    {
        public static string RemoveDiacritics(string stIn)
        {
            // FormD splits each accented letter into a base letter plus combining marks.
            string stFormD = stIn.Normalize(NormalizationForm.FormD);
            StringBuilder sb = new StringBuilder();
    
            for(int ich = 0; ich < stFormD.Length; ich++) {
                UnicodeCategory uc = CharUnicodeInfo.GetUnicodeCategory(stFormD[ich]);
                // Keep everything except the combining (non-spacing) marks.
                if(uc != UnicodeCategory.NonSpacingMark) {
                    sb.Append(stFormD[ich]);
                }
            }
    
            // Recompose whatever is left into the usual precomposed form.
            return(sb.ToString().Normalize(NormalizationForm.FormC));
        }
    }
    

    So, your code could clean the input by calling:

    line = Remove.RemoveDiacritics(line);
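
    To make the effect concrete, here is a small self-contained sketch that applies the same decompose-and-filter approach to the characters from the question. The class name DiacriticsDemo and the method name Strip are made up for this illustration; only the .NET calls already used in RemoveDiacritics above are involved. Characters are written as \u escapes so the result does not depend on the source file's encoding.

```csharp
using System;
using System.Globalization;
using System.Text;

// Hypothetical demo class; the names are illustrative, not from the original answer.
public static class DiacriticsDemo
{
    // Same technique as RemoveDiacritics: decompose, drop combining marks, recompose.
    public static string Strip(string input)
    {
        string decomposed = input.Normalize(NormalizationForm.FormD);
        var sb = new StringBuilder(decomposed.Length);

        foreach (char c in decomposed)
        {
            if (CharUnicodeInfo.GetUnicodeCategory(c) != UnicodeCategory.NonSpacingMark)
            {
                sb.Append(c);
            }
        }

        return sb.ToString().Normalize(NormalizationForm.FormC);
    }

    public static void Main()
    {
        // The characters from the question: à è ì ò ù - À
        string sample = "\u00E0 \u00E8 \u00EC \u00F2 \u00F9 - \u00C0";
        Console.WriteLine(Strip(sample)); // a e i o u - A

        // FormD turns è (U+00E8) into 'e' followed by U+0300 (combining grave accent).
        string decomposed = "\u00E8".Normalize(NormalizationForm.FormD);
        Console.WriteLine(decomposed.Length); // 2

        // Letters without a canonical decomposition, such as ø (U+00F8) and ß (U+00DF),
        // pass through unchanged, so the result is not guaranteed to be pure ASCII.
        Console.WriteLine(Strip("\u00F8\u00DF") == "\u00F8\u00DF"); // True
    }
}
```

    If the output really has to be ASCII only, letters such as ø or ß would still need an explicit replacement step of your own after this.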
    
