Question
I need a way to convert special characters like this: Helloæ to normal characters, so that this word would end up being Helloae. So far I have tried HttpUtility.Decode and a method that converts UTF-8 to Windows-1252, but nothing worked. Is there something simple and generic that would do this job?
Thank you.
EDIT
I have tried implementing those two methods using posts here on OC. Here are the methods:
public static string ConvertUTF8ToWin1252(string _source)
{
    Encoding utf8 = new UTF8Encoding();
    Encoding win1252 = Encoding.GetEncoding(1252);
    byte[] input = _source.ToUTF8ByteArray();
    byte[] output = Encoding.Convert(utf8, win1252, input);
    return win1252.GetString(output);
}

// It should be noted that this method is expecting UTF-8 input only,
// so you probably should give it a more fitting name.
private static byte[] ToUTF8ByteArray(this string _str)
{
    Encoding encoding = new UTF8Encoding();
    return encoding.GetBytes(_str);
}
But it did not work. The string remains unchanged.
Answer 1:
See: Does .NET transliteration library exists?
UnidecodeSharpFork
Usage:
var result = "Helloæ".Unidecode();
Console.WriteLine(result); // Prints Helloae
Answer 2:
There is no direct mapping between æ and ae; they are completely different Unicode code points. If you need to do this, you'll most likely need to write a function that maps the offending code points to the strings you want.
Per the comments you may need to take a two stage approach to this:
- Remove the diacritics and combining characters per the link to the possible duplicate
- Map any characters left that are not combining to alternate strings
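// The method name is illustrative only.
static string MapChar(char badChar)
{
    switch (badChar)
    {
        case 'æ':
            return "ae";
        case 'ø':
            return "oe";
        // and so on
        default:
            return badChar.ToString(); // leave unmapped characters as they are
    }
}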
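Putting the two stages together might look roughly like the sketch below. It is only an illustration, not a complete transliterator: the class name `Transliterator`, the method name `ToPlainText`, and the contents of the replacement table are all assumptions, and the table would need to be extended for whatever characters your input actually contains. Stage 1 uses `String.Normalize` with `NormalizationForm.FormD` to split accented letters into a base letter plus combining marks, then drops the marks. Stage 2 looks up whatever is left (such as æ, which has no decomposition) in the table.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Text;

public static class Transliterator
{
    // Hypothetical, deliberately incomplete table for characters that
    // have no canonical decomposition. Extend it as needed.
    private static readonly Dictionary<char, string> Replacements =
        new Dictionary<char, string>
        {
            { 'æ', "ae" }, { 'Æ', "AE" },
            { 'ø', "oe" }, { 'Ø', "OE" },
            { 'ß', "ss" },
        };

    public static string ToPlainText(string source)
    {
        // Stage 1: decompose, so that e.g. 'é' becomes 'e' + U+0301.
        string decomposed = source.Normalize(NormalizationForm.FormD);
        var result = new StringBuilder(decomposed.Length);

        foreach (char c in decomposed)
        {
            // Drop the combining marks produced by the decomposition.
            if (CharUnicodeInfo.GetUnicodeCategory(c) == UnicodeCategory.NonSpacingMark)
                continue;

            // Stage 2: map remaining special characters through the table.
            if (Replacements.TryGetValue(c, out string replacement))
                result.Append(replacement);
            else
                result.Append(c);
        }

        return result.ToString();
    }
}
```

With this sketch, `Transliterator.ToPlainText("Helloæ")` returns `Helloae`. A character that is neither decomposable nor in the table passes through unchanged, so gaps in the table are easy to spot in the output.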
Source: https://stackoverflow.com/questions/17366978/convert-special-characters-to-normal