Here is my situation: I need to correctly determine which character encoding is used for given text file. Hopefully, it can correctly return one of the following types:
For starters, there's no such physical encoding as "Unicode". What you probably mean by this is UTF-16. Secondly, any file is valid in "ANSI", or any single-byte encoding for that matter. The only thing you can do is guess in the best order which is most likely to throw out invalid matches.
You should check, in this order:
If you expect UTF-16 files without BOM as well (it's possible for, for example, XML files which specify the encoding in the XML declaration), then you have to shove that rule in there as well. Though any of the above may produce a false positive, falsely identifying an ANSI file as UTF-* (though it's unlikely). You should always have metadata that tells you what encoding a file is in, detecting it after the fact is not possible with 100% accuracy.