I have a file with characters like these: à, è, ì, ò, ù, À. What I need to do is replace those characters with plain characters, e.g. à = a, è = e, and so on. This is my c
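Others have commented on using a Unicode lookup table to remove diacritics. I did a quick Google search and found this example, which avoids a hand-written table: it normalizes the string to Form D, which splits each accented character into a base letter followed by combining marks, and then drops the marks. Code shamelessly copied (and re-formatted) below: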
using System;
using System.Text;
using System.Globalization;

public static class Remove
{
    public static string RemoveDiacritics(string stIn)
    {
        // Decompose each accented character into a base letter plus combining marks.
        string stFormD = stIn.Normalize(NormalizationForm.FormD);
        StringBuilder sb = new StringBuilder();

        for (int ich = 0; ich < stFormD.Length; ich++)
        {
            // Keep everything except the combining (non-spacing) marks.
            UnicodeCategory uc = CharUnicodeInfo.GetUnicodeCategory(stFormD[ich]);
            if (uc != UnicodeCategory.NonSpacingMark)
            {
                sb.Append(stFormD[ich]);
            }
        }

        // Recompose whatever is left into the usual composed form.
        return sb.ToString().Normalize(NormalizationForm.FormC);
    }
}
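If it helps to see what the Form D step actually produces, here is a minimal sketch that prints the code units and Unicode categories of a decomposed string. The class name DecompositionDemo and the method Describe are made up for illustration; they are not part of the copied example.

```csharp
using System;
using System.Globalization;
using System.Text;

public static class DecompositionDemo
{
    // Hypothetical helper: lists each UTF-16 code unit of the Form D
    // version of the input, together with its Unicode category.
    public static string Describe(string s)
    {
        var sb = new StringBuilder();
        foreach (char c in s.Normalize(NormalizationForm.FormD))
        {
            if (sb.Length > 0) sb.Append(' ');
            sb.Append("U+").Append(((int)c).ToString("X4"))
              .Append(':').Append(CharUnicodeInfo.GetUnicodeCategory(c));
        }
        return sb.ToString();
    }

    public static void Main()
    {
        // "\u00E8" is è; after decomposition it is 'e' plus a combining grave accent.
        Console.WriteLine(Describe("\u00E8"));
        // prints: U+0065:LowercaseLetter U+0300:NonSpacingMark
    }
}
```

The second entry is the NonSpacingMark that the loop in RemoveDiacritics skips, which is why only the plain e survives.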
So, your code could clean the input by calling:
line = Remove.RemoveDiacritics(line);
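One caveat: this only works for characters that Unicode decomposes into a base letter plus a combining mark. Letters such as ø, ł, đ and ß have no such decomposition, so they pass through unchanged. If your file can contain them, a small fallback table is one option. The sketch below is a variant of the method above; the names TextCleaner, StripAccents and Fallback are hypothetical, and the table is deliberately incomplete.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Text;

public static class TextCleaner
{
    // Hypothetical, incomplete map for letters that Form D leaves untouched.
    // Extend it with whatever your data actually contains.
    private static readonly Dictionary<char, string> Fallback =
        new Dictionary<char, string>
        {
            { '\u00F8', "o" },  // ø
            { '\u00D8', "O" },  // Ø
            { '\u0142', "l" },  // ł
            { '\u0141', "L" },  // Ł
            { '\u0111', "d" },  // đ
            { '\u0110', "D" },  // Đ
            { '\u00DF', "ss" }, // ß
        };

    public static string StripAccents(string input)
    {
        string decomposed = input.Normalize(NormalizationForm.FormD);
        var sb = new StringBuilder(decomposed.Length);

        foreach (char c in decomposed)
        {
            // Drop combining accents, exactly as RemoveDiacritics does.
            if (CharUnicodeInfo.GetUnicodeCategory(c) == UnicodeCategory.NonSpacingMark)
                continue;

            // Substitute letters from the fallback table; keep everything else.
            string replacement;
            if (Fallback.TryGetValue(c, out replacement))
                sb.Append(replacement);
            else
                sb.Append(c);
        }

        return sb.ToString().Normalize(NormalizationForm.FormC);
    }

    public static void Main()
    {
        Console.WriteLine(StripAccents("àèìòù À"));      // aeiou A
        Console.WriteLine(StripAccents("Łódź, Søren"));  // Lodz, Soren
    }
}
```

Whether ß should become "ss" or ø should become "o" depends on the language of the text, so treat the table entries as placeholders rather than a standard mapping.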