Java ZipInputStream extraction errors

耗尽温柔 提交于 2019-12-11 05:54:04

问题


Below is some code which extracts a file from a zip file containing only a single file. However, the extracted file does not match the same file extracted via WinZip or other zip utility. I expect that it might be off by a byte if the file contains an odd number of bytes (because my buffer is size 2 and I just abort once the read fails). However, when analyzing (using WinMerge or Diff) the file extracted with code below vs. file extracted via Winzip, there are several areas where bytes are missing from the Java extraction. Does anyone know why or how I can fix this?

package zipinputtest;

import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.zip.ZipInputStream;

public class test2 {
    public static void main(String[] args) {
        try {
            ZipInputStream zis = new ZipInputStream(new FileInputStream("C:\\temp\\sample3.zip"));
            File outputfile = new File("C:\\temp\\sample3.bin");
            OutputStream os = new BufferedOutputStream(new FileOutputStream(outputfile));
            byte[] buffer2 = new byte[2];
            zis.getNextEntry();
            while(true) {
                if(zis.read(buffer2) != -1) {
                    os.write(buffer2);
                }
                else break;
            }
            os.flush();
            os.close();
            zis.close();
        } catch (FileNotFoundException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}

I was able to produce the error using this image (save it and zip as sample3.zip and run the code on it), but any binary file of sufficient size should show the discrepancies.


回答1:


while (true) {
    if(zis.read(buffer2) != -1) {
        os.write(buffer2);
    }
    else break;
}

Usual problem. You're ignoring the count. Should be:

int count;
while ((count = zis.read(buffer2)) != -1)
{
    os.write(buffer2, 0, count);
}

NB:

  1. A buffer size of 2 is ridiculous. Use 8192 or more.
  2. flush() before close() is redundant.



回答2:


You can use a more verbatim way to check whether all bytes are read and written, e.g. a method like

  public int extract(ZipInputStream in, OutputStream out) throws IOException {
    byte[] buffer = new byte[BUFFER_SIZE];
    int total = 0;
    int read;
    while ((read = in.read(buffer)) != -1) {
      total += read;
      out.write(buffer, 0, read);
    }
    return total;
  }

If the read parameter is not used in write(), the method assumes that the entire contents of the buffer will be written out, which may not be correct, if the buffer is not fully filled.

The OutputStream can be flushed and closed inside or outside the extract() method. Calling close() should be enough, since it also calls flush().

In any case, the "standard" I/O code of Java, like the java.util.zip package, have been tested and used extensively, so it is highly unlikely it could have a bug so fundamental as to cause bytes to be missed so easily.



来源:https://stackoverflow.com/questions/43508690/java-zipinputstream-extraction-errors

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!