GZIPInputStream is prematurely closed when reading from s3

Submitted by 删除回忆录丶 on 2019-12-08 08:10:42

Question


new BufferedReader(new InputStreamReader(
       new GZIPInputStream(s3Service.getObject(bucket, objectKey).getDataInputStream())))

creates a Reader that returns null from readLine() after ~100 lines if the file is larger than several MB. Not reproducible on gzip files smaller than 1 MB. Does anybody know how to handle this?


Answer 1:


From the documentation of BufferedReader#readLine():

Returns:

A String containing the contents of the line, not including any line-termination characters, or null if the end of the stream has been reached

I would say it is pretty clear what this means: the end of the file/stream has been reached and no more data is available.

A notable quirk of the gzip format: multiple gzip members can simply be appended to one another to create a larger file containing several compressed objects. It appears that GZIPInputStream only reads the first of those members (older JDK versions in particular behaved this way; later releases handle concatenated members transparently).

That also explains why it works for "small files": those contain only one compressed member, so the whole file is read.

Note: if GZIPInputStream stops non-destructively at the end of one gzip member, you could open another GZIPInputStream on the same underlying InputStream and read the remaining members.
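The multi-member behavior described above can be demonstrated without S3 at all. The sketch below (class and method names are illustrative, not from the original question) builds a byte array containing two concatenated gzip members and then decompresses it through the same BufferedReader/GZIPInputStream chain the question uses. On a current JDK, GZIPInputStream reads across member boundaries and both parts come back; on older JDKs it would stop after the first member, which matches the premature-null symptom described here.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class MultiMemberGzip {

    // Build a single byte stream holding two concatenated gzip members,
    // mimicking how some tools produce multi-member .gz objects.
    static byte[] twoMembers() throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (String part : new String[] {"first member\n", "second member\n"}) {
            GZIPOutputStream gz = new GZIPOutputStream(out);
            gz.write(part.getBytes(StandardCharsets.UTF_8));
            gz.close(); // writes this member's trailer; ByteArrayOutputStream.close() is a no-op
        }
        return out.toByteArray();
    }

    // Decompress with the same reader chain as in the question.
    static String decompressAll(byte[] gzipBytes) throws IOException {
        BufferedReader reader = new BufferedReader(new InputStreamReader(
                new GZIPInputStream(new ByteArrayInputStream(gzipBytes)),
                StandardCharsets.UTF_8));
        StringBuilder sb = new StringBuilder();
        String line;
        while ((line = reader.readLine()) != null) {
            sb.append(line).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        System.out.print(decompressAll(twoMembers()));
    }
}
```

If your JDK stops after the first member, the practical workarounds are to upgrade, or to use a decompressor that explicitly supports concatenated members (for example, Apache Commons Compress offers this as an option).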



Source: https://stackoverflow.com/questions/31275728/gzipinputstream-is-prematurely-closed-when-reading-from-s3
