ZipInputStream(InputStream, Charset) decodes ZipEntry file name falsely

喜欢而已 提交于 2019-12-19 08:06:30

问题


Java 7 is supposed to fix an old problem with unpacking zip archives with character sets other than UTF-8. This can be achieved by constructor ZipInputStream(InputStream, Charset). So far, so good. I can unpack a zip archive containing file names with umlauts in them when explicitly setting an ISO-8859-1 character set.

But here is the problem: When iterating over the stream using ZipInputStream.getNextEntry(), the entries have wrong special characters in their names. In my case the umlaut "ü" is replaced by a "?" character, which is obviously wrong. Does anybody know how to fix this? Obviously ZipEntry ignores the Charset of its underlying ZipInputStream. It looks like yet another zip-related JDK bug, but I might be doing something wrong as well.

...
zipStream = new ZipInputStream(
    new BufferedInputStream(new FileInputStream(archiveFile), BUFFER_SIZE),
    Charset.forName("ISO-8859-1")
);
while ((zipEntry = zipStream.getNextEntry()) != null) {
    // wrong name here, something like "M?nchen" instead of "München"
    System.out.println(zipEntry.getName());
    ...
}

回答1:


OMG, I played around for two or so hours, but just five minutes after I finally posted the question here, I bumped into the answer: My zip file was not encoded with ISO-8859-1, but with Cp437. So the constructor call should be:

zipStream = new ZipInputStream(
    new BufferedInputStream(new FileInputStream(archiveFile), BUFFER_SIZE),
    Charset.forName("Cp437")
);

Now it works like a charm. Sorry for bothering you anyway. I hope this helps someone else facing similar problems.



来源:https://stackoverflow.com/questions/11276343/zipinputstreaminputstream-charset-decodes-zipentry-file-name-falsely

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!