Namely, how would you tell an archive (jar/rar/etc.) file from a textual (xml/txt, encoding-independent) one?
Have a look at the jMimeMagic library.
jMimeMagic is a Java library for determining the MIME type of files or streams.
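If you'd rather see the idea than pull in a dependency, here is a minimal hand-rolled sketch of magic-number sniffing. To be clear, this is not jMimeMagic's API: the class `FileKindSniffer`, the `Kind` enum, and the 512-byte window are all made up for illustration. It only knows the ZIP local-file-header signature (`PK\x03\x04`, which also covers jar) and the RAR signature prefix (`Rar!\x1A\x07`), and its "text" check is a rough heuristic: a Unicode BOM, or no NUL/control bytes in the sampled prefix. UTF-16 text without a BOM will come back as `UNKNOWN`, so it is only partly encoding-independent.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of jMimeMagic or the JDK.
final class FileKindSniffer {
    enum Kind { ARCHIVE, TEXT, UNKNOWN }

    // ZIP local file header; jar files are ZIP archives.
    private static final byte[] ZIP = {'P', 'K', 3, 4};
    // Signature prefix shared by RAR 4 and RAR 5 files.
    private static final byte[] RAR = {'R', 'a', 'r', '!', 0x1A, 0x07};
    // Byte order marks for UTF-8, UTF-16 LE and UTF-16 BE.
    private static final byte[][] BOMS = {
        {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF},
        {(byte) 0xFF, (byte) 0xFE},
        {(byte) 0xFE, (byte) 0xFF},
    };

    // Reads only the first 512 bytes (an arbitrary sample size). Needs Java 11+.
    static Kind classify(Path file) throws IOException {
        try (InputStream in = Files.newInputStream(file)) {
            return classify(in.readNBytes(512));
        }
    }

    static Kind classify(byte[] head) {
        if (startsWith(head, ZIP) || startsWith(head, RAR)) {
            return Kind.ARCHIVE;
        }
        if (head.length == 0) {
            return Kind.UNKNOWN;
        }
        for (byte[] bom : BOMS) {
            if (startsWith(head, bom)) {
                return Kind.TEXT;
            }
        }
        // Heuristic: treat NUL or other control bytes as a sign of binary data.
        for (byte b : head) {
            int c = b & 0xFF;
            boolean whitespace = c == '\t' || c == '\n' || c == '\r' || c == '\f';
            if (c < 0x20 && !whitespace) {
                return Kind.UNKNOWN;
            }
        }
        return Kind.TEXT;
    }

    private static boolean startsWith(byte[] data, byte[] prefix) {
        if (data.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if (data[i] != prefix[i]) {
                return false;
            }
        }
        return true;
    }
}
```

You'd call it as `FileKindSniffer.classify(Path.of("some-file"))`. A real MIME-detection library carries a much larger signature table and handles the edge cases this sketch ignores (empty ZIPs, other archive formats, BOM-less UTF-16), which is the reason to prefer one for anything beyond a quick check.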