Im having a problem with removing non-utf8 characters from string, which are not displaying properly. Characters are like this 0x97 0x61 0x6C 0x6F (hex representation)
Slightly different to the question, but what I am doing is to use HtmlEncode(string),
pseudo code here
var encoded = HtmlEncode(string);
encoded = Regex.Replace(encoded, "\d+?;", "");
var result = HtmlDecode(encoded);
input and output
"Headlight\x007E Bracket, { Cafe Racer<> Style, Stainless Steel 中文呢?"
"Headlight~ Bracket, { Cafe Racer<> Style, Stainless Steel 中文呢?"
I know it's not perfect, but does the job for me.