I am reading files in various formats and languages, and I am currently using a small encoding library to attempt to detect the proper encoding (http://www.codeproject.c
You have to keep the original data as a byte array or MemoryStream, which you can then translate to the new encoding. Once you have converted the data to a string, you can't reliably get back to the original representation.
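To illustrate why the raw bytes matter, here is a minimal sketch in Python (chosen only for brevity; the same principle applies to a byte array or MemoryStream). The sample text and the "wrong guess" of ASCII are assumptions made up for the demonstration, not something from the question.

```python
# Raw bytes as they might be read from a file: "café" encoded as UTF-8.
# (Hypothetical sample data, used only for illustration.)
raw = "café".encode("utf-8")

# A wrong first guess: decoding as ASCII with replacement swaps every
# byte it cannot map for U+FFFD, so information is lost at this point.
wrong = raw.decode("ascii", errors="replace")

# Encoding that string back does not reproduce the original bytes.
round_trip = wrong.encode("ascii", errors="replace")
print(round_trip == raw)  # False: the original bytes cannot be recovered

# Because the raw bytes were kept, decoding again with a better guess works.
right = raw.decode("utf-8")
print(right)
```

The point is that the second attempt starts from `raw`, not from `wrong`. If only the decoded string had been kept, there would be nothing left to retry with.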