How to identify the file type even though the file-extension has been changed?

廉价感情. Submitted on 2019-12-10 11:24:36

Question


Files are categorized by file extension. So my question is: how can I identify a file's type even when its extension has been changed?

For example, I have a video file named myVideo.mp4, and I have renamed it to myVideo.txt. If I double-click it, the default text editor opens the file and does not show the actual content. But if I open myVideo.txt in a video player, the video plays without any problem.

I was just thinking of developing an application that determines the type of a file without checking the file extension and suggests software for opening it. I would like to develop the application in Java.


Answer 1:


Structure, magic numbers, metadata, strings and regular expressions, heuristics and statistical analysis... the tool will only be as good as the database of rules behind it.

Try DROID (Digital Record Object IDentification tool) for identifying file types. It is written in Java and released under the New BSD license. It is a free project of the National Archives UK, unrelated to Android. The source is available on GitHub and SourceForge, and the DROID documentation is good.

See also Darwinsys file and libmagic.




Answer 2:


One of the best libraries for this is Apache Tika. It doesn't only read the file's header; it is also capable of performing content analysis to detect the file type. Using Tika is very simple. Here's an example of detecting a file's type:

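import java.io.IOException;
import java.net.URL;
import org.apache.tika.Tika; // the Tika facade class

public class TestTika {

    // Both new URL(...) and tika.detect(...) can throw IOException, so main declares it.
    public static void main(String[] args) throws IOException {
        Tika tika = new Tika();
        // Detect the media type of the resource behind the URL.
        String fileType = tika.detect(new URL("http://example.com/someFile.jpg"));
        System.out.println(fileType);
    }

}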



Answer 3:


There's a tool called TrID that does what you are after. It currently supports 5033 different file types and can be trained to add new ones. On *nix systems, there's also the file command, which does something similar.




Answer 4:


Well, it's like keeping a database of the file formats you want to recognize in your app, without looking at the extension, just as Linux does. Whenever you open a file, you check it against the file-format database to see which type it belongs to. I'm not sure how well it will work for every file type, but most files have a fixed header format (zip, pdf, mpg, avi, png, etc.), so this approach should work.
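
As a rough illustration of the fixed-header idea, here is a minimal Java sketch that compares the first bytes of a file against a small signature table. The class name MagicNumberSniffer, its method names, and the three-entry table are assumptions made for this example, not part of any library. A real tool would need a far larger rule database, and formats whose signature does not start at offset 0 would need extra handling.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

public class MagicNumberSniffer {

    // Hypothetical, deliberately tiny signature table: label -> leading bytes.
    private static final Map<String, byte[]> SIGNATURES = new LinkedHashMap<>();
    static {
        SIGNATURES.put("png", new byte[] {(byte) 0x89, 'P', 'N', 'G', 0x0D, 0x0A, 0x1A, 0x0A});
        SIGNATURES.put("pdf", new byte[] {'%', 'P', 'D', 'F', '-'});
        SIGNATURES.put("zip", new byte[] {'P', 'K', 0x03, 0x04});
    }

    // Returns the label of the first signature that prefixes the header, else "unknown".
    public static String detectFromHeader(byte[] header) {
        for (Map.Entry<String, byte[]> entry : SIGNATURES.entrySet()) {
            byte[] signature = entry.getValue();
            if (header.length >= signature.length
                    && Arrays.equals(Arrays.copyOf(header, signature.length), signature)) {
                return entry.getKey();
            }
        }
        return "unknown";
    }

    // Reads up to 16 leading bytes of the file; the file name is never inspected.
    public static String detectFromFile(Path path) throws IOException {
        byte[] buffer = new byte[16];
        int total = 0;
        try (InputStream in = Files.newInputStream(path)) {
            int n;
            while (total < buffer.length
                    && (n = in.read(buffer, total, buffer.length - total)) > 0) {
                total += n;
            }
        }
        return detectFromHeader(Arrays.copyOf(buffer, total));
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.out.println("usage: MagicNumberSniffer <file>");
            return;
        }
        System.out.println(detectFromFile(Paths.get(args[0])));
    }
}
```

Passing a path as the first argument prints the label of the first matching signature, or "unknown" when nothing in the table matches, regardless of what extension the file carries.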




Answer 5:


You could try MimeUtil2, but it's quite old and therefore not up to date. The best way is still the file extension.

But the solution from Adam is not as bad as you think. You could build a platform-independent solution using a wrapper around command-line calls. I think you will get much better results using this method.
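
One way such a wrapper could look in Java is sketched below. It assumes a file utility that accepts the --brief and --mime-type options is available on the PATH (as on most Linux and macOS systems). The class and method names are made up for this example, and the method returns an empty Optional when the path is not a regular file or when the command is missing or fails.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Optional;

public class FileCommandWrapper {

    // Hypothetical helper: asks the external `file` utility for a MIME type.
    public static Optional<String> detectMimeType(Path path) {
        if (!Files.isRegularFile(path)) {
            return Optional.empty();
        }
        ProcessBuilder builder =
                new ProcessBuilder("file", "--brief", "--mime-type", path.toString());
        builder.redirectErrorStream(true); // merge stderr into stdout
        try {
            Process process = builder.start();
            String firstLine;
            try (BufferedReader reader = new BufferedReader(
                    new InputStreamReader(process.getInputStream(), StandardCharsets.UTF_8))) {
                firstLine = reader.readLine();
                // Drain any remaining output so the process cannot block on a full pipe.
                while (reader.readLine() != null) {
                    // ignore
                }
            }
            int exitCode = process.waitFor();
            if (exitCode != 0 || firstLine == null || firstLine.trim().isEmpty()) {
                return Optional.empty();
            }
            return Optional.of(firstLine.trim());
        } catch (IOException e) {
            // Typically means the `file` command could not be started.
            return Optional.empty();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        Path path = Paths.get(args.length > 0 ? args[0] : "myVideo.txt");
        System.out.println(detectMimeType(path).orElse("unknown"));
    }
}
```

Returning an Optional keeps the caller in charge of the fallback, for example falling back to the file extension on systems where the command is not installed.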




Answer 6:


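The following code snippet retrieves information about the file type. Note that MimetypesFileTypeMap determines the type from the file name's extension rather than from the file's content.

import java.io.File;
import javax.activation.MimetypesFileTypeMap;

final File file = new File("file.txt");
System.out.println("File type is: " + new MimetypesFileTypeMap().getContentType(file));

Hopefully this helps.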



Source: https://stackoverflow.com/questions/15565221/how-to-identify-the-file-type-even-though-the-file-extension-has-been-changed
