I am reading files in various formats and languages and I am currently using a small encoding library to attempt to detect the proper encoding (http://www.codeproject.c
Beware of the infamous 'Notepad bug', though: it's going to bite you whatever you try. You can find some good discussions about encodings and their challenges on MSDN (and other places).
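
To make the caveat concrete, here is a minimal sketch in Python (not the library from the question) of a common detection order: look for a byte order mark first, then check whether the bytes are valid UTF-8, then fall back to a legacy code page. The function name `guess_encoding` and the `cp1252` fallback are my own assumptions for illustration; pick whatever fallback matches the files you actually receive.

```python
import codecs

# Byte order marks, with UTF-32 checked before UTF-16 because the
# UTF-32 LE BOM (FF FE 00 00) starts with the UTF-16 LE BOM (FF FE).
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def guess_encoding(data: bytes, fallback: str = "cp1252") -> str:
    """Hypothetical helper: return a codec name that is a plausible guess.

    The fallback code page is an assumption, not something that can be
    detected reliably from the bytes alone.
    """
    # 1. A BOM is the only strong signal, so trust it when present.
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    # 2. Non-ASCII text in a legacy code page is rarely valid UTF-8,
    #    so a clean strict decode is decent evidence for UTF-8.
    try:
        data.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        # 3. Everything else is guesswork.
        return fallback


# The classic 'Notepad bug' input: plain ASCII that also happens to be
# structurally valid UTF-16 LE, which is what fooled a statistical guess.
sample = b"Bush hid the facts"
as_utf16 = sample.decode("utf-16-le")  # decodes without error
```

Here `guess_encoding(sample)` returns `"utf-8"` only because the sketch never guesses UTF-16 without a BOM. A detector that does guess can be fooled by exactly this kind of short input, so if you have any out-of-band information (an HTTP header, a file extension convention, a user setting), prefer it over detection.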