How to identify a ZIP file in Java?

Submitted by 亡梦爱人 on 2019-12-05 08:41:28

I'd suggest opening a plain InputStream and reading the first few bytes (the magic bytes) rather than relying on the file extension, which can easily be spoofed. This also avoids the overhead of creating and parsing the archive.

For RAR the first bytes should be 52 61 72 21 1A 07.

For ZIP it should be one of:

  • 50 4B 03 04
  • 50 4B 05 06 (empty archive)
  • 50 4B 07 08 (spanned archive).

Source: https://en.wikipedia.org/wiki/List_of_file_signatures
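The signature check described above could be sketched roughly as follows. The class name ArchiveSniffer, the Kind enum and both classify methods are made up for this illustration; only the byte values come from the lists above. It assumes Java 9 or later for InputStream.readNBytes.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

public class ArchiveSniffer {

    /** Hypothetical result type, introduced only for this sketch. */
    public enum Kind { ZIP, RAR, UNKNOWN }

    // Signatures taken from the lists above.
    private static final byte[][] ZIP_SIGNATURES = {
        {0x50, 0x4B, 0x03, 0x04},
        {0x50, 0x4B, 0x05, 0x06}, // empty archive
        {0x50, 0x4B, 0x07, 0x08}, // spanned archive
    };
    private static final byte[] RAR_SIGNATURE = {0x52, 0x61, 0x72, 0x21, 0x1A, 0x07};

    /** Classifies a buffer by its leading bytes; only the first 'length' bytes count. */
    public static Kind classify(byte[] header, int length) {
        for (byte[] signature : ZIP_SIGNATURES) {
            if (startsWith(header, length, signature)) {
                return Kind.ZIP;
            }
        }
        return startsWith(header, length, RAR_SIGNATURE) ? Kind.RAR : Kind.UNKNOWN;
    }

    /** Reads at most six bytes from the file and classifies them. */
    public static Kind classify(Path file) throws IOException {
        byte[] header = new byte[RAR_SIGNATURE.length];
        try (InputStream in = Files.newInputStream(file)) {
            // readNBytes keeps reading until the buffer is full or the stream ends
            int read = in.readNBytes(header, 0, header.length);
            return classify(header, read);
        }
    }

    private static boolean startsWith(byte[] data, int length, byte[] prefix) {
        if (Math.min(length, data.length) < prefix.length) {
            return false; // too short to contain the signature
        }
        for (int i = 0; i < prefix.length; i++) {
            if (data[i] != prefix[i]) {
                return false;
            }
        }
        return true;
    }
}
```

A matching signature only says the file starts like an archive; a truncated or corrupt file can still fail when it is actually opened.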

Another point, just looked at your code:

Why do you catch the InvalidZipException, throw it away and construct a new one? This way you lose all the information from the original exception, which makes it hard to debug and to understand what exactly went wrong. Either don't catch it at all or, if you have to wrap it, pass the original exception along as the cause:

} catch (InvalidZipException e) {
  throw new InvalidZipException("Not a zip file", e);
}
fab says Reinstate Monica

Merging the answers of nanda & bratkartoffel.

private static boolean isArchive(File f) {
    int fileSignature = 0;
    try (RandomAccessFile raf = new RandomAccessFile(f, "r")) {
        fileSignature = raf.readInt();
    } catch (IOException e) {
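        // Unreadable, or shorter than 4 bytes (EOFException): fileSignature stays 0,
        // so the file is reported as not an archive. Handle or log here if you like.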
    }
    return fileSignature == 0x504B0304 || fileSignature == 0x504B0506 || fileSignature == 0x504B0708;
}
nanda
RandomAccessFile raf = new RandomAccessFile(f, "r");
long n = raf.readInt();
raf.close();

if (n == 0x504B0304)
    System.out.println("Should be a zip file");
else
    System.out.println("Not a zip file");

You can see it in the following link. http://www.coderanch.com/t/381509/java/java/check-file-zip-file-java

The exception is thrown in this line:

ZipFile zipFile = new ZipFile(pathToFile.toFile());

That's because the ZipFile constructor throws a ZipException when it is given a file that is not a ZIP archive. So before creating a new ZipFile object, you have to check that your file path points to a valid ZIP file. One solution might be to check the extension of the file path, like so:

 PathMatcher matcher = FileSystems.getDefault().getPathMatcher("glob:*.zip");
 // "*" does not cross directory boundaries, so match against the file name only
 boolean extensionCorrect = matcher.matches(path.getFileName());
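Since the ZipFile constructor itself rejects non-ZIP input, another option is to let it make the decision and treat the ZipException as the answer. The following is only a sketch of that idea; the class ZipProbe and the method opensAsZip are hypothetical names, and rethrowing other I/O failures as UncheckedIOException is a design choice of this example, not something the original code does.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.zip.ZipException;
import java.util.zip.ZipFile;

public class ZipProbe {

    /**
     * Returns true if java.util.zip.ZipFile can open the file and
     * false if it rejects the content with a ZipException.
     * Other I/O problems (for example a missing file) are rethrown unchecked.
     */
    public static boolean opensAsZip(File file) {
        try (ZipFile ignored = new ZipFile(file)) {
            return true;
        } catch (ZipException e) {
            // The file exists and is readable but is not a valid ZIP archive
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

This opens the archive and reads its central directory, so it costs more than inspecting a few leading bytes, but it does not depend on the file name at all.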

Use Apache Tika to detect the actual document type.

Even if a file has been renamed to have a .zip extension, Tika identifies its original type.

Reference: https://www.baeldung.com/apache-tika
