I'm using the StreamReader class in .NET like this:
using (StreamReader reader = new StreamReader("c:\\somefile.html", true)) {
string filetext = rea
UTF-8 is designed so that text written in an arbitrary 8-bit encoding such as latin1 is very unlikely to decode as valid UTF-8. If the bytes decode cleanly as UTF-8, they almost certainly are UTF-8.
So the minimal approach is this (pseudocode, I don't speak .NET):
    try:
        u = some_text.decode("UTF-8")
    except UnicodeDecodeError:
        u = some_text.decode("most-likely-encoding")
For the most-likely-encoding one usually picks latin1, cp1252, or whatever is common for the data at hand. More sophisticated approaches might try to find language-specific character pairings, but I'm not aware of a library that does that.
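As a runnable version of the pseudocode above, here is a minimal Python sketch. The function name `decode_with_fallback` and the default fallback of cp1252 are my own choices for illustration, not part of any library; swap in whatever encoding is most likely for your files.

```python
def decode_with_fallback(data: bytes, fallback: str = "cp1252") -> str:
    """Decode as UTF-8 if the bytes are valid UTF-8, else use a fallback encoding.

    `fallback` is an assumption about the data: pick the 8-bit encoding
    your files most likely use.
    """
    try:
        # Strict decoding raises UnicodeDecodeError on any invalid UTF-8 sequence.
        return data.decode("utf-8")
    except UnicodeDecodeError:
        # errors="replace" covers bytes the fallback leaves undefined
        # (for example 0x81 in cp1252) instead of raising a second time.
        return data.decode(fallback, errors="replace")


utf8_bytes = "café".encode("utf-8")      # é becomes two bytes: 0xC3 0xA9
latin1_bytes = "café".encode("latin-1")  # é becomes one byte: 0xE9

print(decode_with_fallback(utf8_bytes))
print(decode_with_fallback(latin1_bytes))
```

The latin1 input fails the UTF-8 attempt because the byte 0xE9 announces a multi-byte sequence that never arrives, so the fallback kicks in. Note that this only tells UTF-8 apart from "something else"; it cannot tell two 8-bit encodings apart from each other.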