I have been given an export from a MySQL database whose encoding seems to have been muddled over time, and it contains a mix of HTML character codes
such as
Only part of the data is unrecoverable, because Windows-1252 leaves 5 byte values unassigned (0x81, 0x8D, 0x8F, 0x90 and 0x9D). Some variants of Windows-1252 map these to control characters, but those do not survive being posted on Stack Overflow. If such a variant was used, you can recover everything as long as the invisible control characters are not lost in a copy-paste.
The non-breaking space is another hazard: copy-pasting usually drops it or turns it into a regular space. That is not an issue when you work with the bytes directly.
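To make the five unassigned slots concrete, here is a minimal sketch that checks whether the UTF-8 bytes of a string contain one of them. The class name `Cp1252Risk` and the method `IsAtRisk` are made up for this illustration; the check only tells you where a strict Windows-1252 pass could have lost data, not whether it actually did.

```csharp
using System;
using System.Linq;
using System.Text;

static class Cp1252Risk
{
    // The five byte values that standard Windows-1252 leaves unassigned.
    static readonly byte[] Unassigned = { 0x81, 0x8D, 0x8F, 0x90, 0x9D };

    // True if the UTF-8 encoding of `text` contains one of those bytes,
    // i.e. a misencoding pass through strict Windows-1252 could lose data here.
    public static bool IsAtRisk(string text) =>
        Encoding.UTF8.GetBytes(text).Any(b => Unassigned.Contains(b));

    static void Main()
    {
        // ö, ü, Á, Ý and the non-breaking space (U+00A0)
        foreach (string s in new[] { "ö", "ü", "Á", "Ý", "\u00A0" })
            Console.WriteLine($"U+{(int)s[0]:X4} at risk: {IsAtRisk(s)}");
    }
}
```

The German umlauts in this question encode to bytes that Windows-1252 does assign, so they are safe. Characters such as Á (C3 81) or Ý (C3 9D) are the ones that can be damaged.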
The string has been misencoded twice: its UTF-8 bytes were read as Windows-1252, then the result was encoded as UTF-8 and read as Windows-1252 again:
UTF-8 -> Windows-1252 -> UTF-8 -> Windows-1252
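The chain above can be reproduced in the forward direction, which helps to recognise what single and double mojibake look like. This is only a sketch: `MojibakeDemo` and `Misencode` are illustrative names, and the `RegisterProvider` call assumes .NET Core 3.0+ or .NET 5+, where code page 1252 is not available by default.

```csharp
using System;
using System.Text;

static class MojibakeDemo
{
    // One misencoding pass: encode as UTF-8, then wrongly decode the bytes as Windows-1252.
    public static string Misencode(string text, Encoding win1252) =>
        win1252.GetString(Encoding.UTF8.GetBytes(text));

    static void Main()
    {
        // Assumption: running on .NET Core / .NET 5+, where this registration is required.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        Encoding win1252 = Encoding.GetEncoding(1252);

        string original = "für";
        string once = Misencode(original, win1252);   // "fÃ¼r"
        string twice = Misencode(once, win1252);      // "fÃƒÂ¼r"

        Console.WriteLine(once);
        Console.WriteLine(twice);
    }
}
```

Each pass turns every non-ASCII character into two or more characters, which is why the damaged text grows longer with every round.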
To recover, undo both passes in reverse order. Here is an example:
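// On .NET Core / .NET 5+, first call Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
string a = "DesinfektionslÃƒÂ¶sungstÃƒÂ¼cher fÃƒÂ¼r FlÃƒÂ¤chen";
Encoding utf8 = Encoding.GetEncoding(65001);
Encoding win1252 = Encoding.GetEncoding(1252);
// Undo each misencoding pass: encode as Windows-1252, decode as UTF-8, twice.
string result = utf8.GetString(win1252.GetBytes(utf8.GetString(win1252.GetBytes(a))));
Console.WriteLine(result);
// Desinfektionslösungstücher für Flächen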
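If the number of misencoding passes varies between rows of the export, one possible approach is to repeat the round trip until it stops being valid. The sketch below is not part of the original answer: `MojibakeRepair`, `Repair` and `maxPasses` are hypothetical names, and it again assumes .NET Core 3.0+ or .NET 5+ for `CodePagesEncodingProvider`. It uses exception fallbacks so that a failed pass throws instead of silently inserting `?` or U+FFFD.

```csharp
using System;
using System.Text;

static class MojibakeRepair
{
    // Hypothetical helper: undoes up to maxPasses UTF-8 -> Windows-1252 misencodings.
    public static string Repair(string text, int maxPasses = 4)
    {
        // Assumption: running on .NET Core / .NET 5+, where this registration is required.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);

        // Strict encodings: throw on unmappable characters or invalid byte sequences.
        Encoding win1252 = Encoding.GetEncoding(
            1252, EncoderFallback.ExceptionFallback, DecoderFallback.ExceptionFallback);
        Encoding strictUtf8 = new UTF8Encoding(false, true);

        string current = text;
        for (int pass = 0; pass < maxPasses; pass++)
        {
            try
            {
                string candidate = strictUtf8.GetString(win1252.GetBytes(current));
                if (candidate == current) break; // pure ASCII, nothing left to undo
                current = candidate;
            }
            catch (ArgumentException)
            {
                // EncoderFallbackException and DecoderFallbackException both derive
                // from ArgumentException: the text is no longer mojibake, so stop.
                break;
            }
        }
        return current;
    }

    static void Main()
    {
        Console.WriteLine(Repair("DesinfektionslÃƒÂ¶sungstÃƒÂ¼cher fÃƒÂ¼r FlÃƒÂ¤chen"));
    }
}
```

The loop stops as soon as a pass throws, which is what happens once the text is clean: a lone `ü` encodes to byte FC, which is not valid UTF-8. Text that legitimately contains sequences like `Ã¼` would be over-repaired, so treat this as a starting point and spot-check the results.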