Buffer a large file; BufferedInputStream limited to 2gb; Arrays limited to 2^31 bytes

只谈情不闲聊 submitted on 2019-12-20 03:19:12

Question


I am sequentially processing a large file and I'd like to keep a large chunk of it in memory. I have 16 GB of RAM available on a 64-bit system.

A quick-and-dirty way to do this is simply to wrap the input stream in a BufferedInputStream. Unfortunately, this only gives me a 2 GB buffer. I'd like to keep more of the file in memory; what alternatives do I have?


Answer 1:


How about letting the OS deal with the buffering of the file? Have you checked what the performance impact of not copying the whole file into the JVM's memory is?

EDIT: You could then use either RandomAccessFile or FileChannel to efficiently read only the necessary parts of the file into the JVM's memory.
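
To make that suggestion concrete, here is a minimal sketch of reading one region of a file with FileChannel. The class name RegionReader and the method readRegion are made up for illustration; only the FileChannel and ByteBuffer calls come from the JDK.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper, not part of the JDK.
final class RegionReader {

    /**
     * Reads up to length bytes starting at the given file offset.
     * Returns fewer bytes if the file ends before length bytes are read.
     */
    static byte[] readRegion(Path file, long offset, int length) throws IOException {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            ByteBuffer buffer = ByteBuffer.allocate(length);
            long position = offset;
            // A single read may return fewer bytes than requested, so loop.
            while (buffer.hasRemaining()) {
                int n = channel.read(buffer, position); // positional read
                if (n < 0) {
                    break; // end of file
                }
                position += n;
            }
            buffer.flip();
            byte[] result = new byte[buffer.remaining()];
            buffer.get(result);
            return result;
        }
    }
}
```

The offset is a long, so regions beyond the 2 GB mark are reachable even though each individual read is limited to an int-sized buffer. Repeated reads of the same region will usually be served from the OS page cache rather than from disk.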




Answer 2:


Have you considered the MappedByteBuffer in java.nio? It's over my head but maybe it is what you are looking for.




Answer 3:


I doubt that buffering more than 2 GB at a time is going to be a huge win anyway. Depending on the amount of processing you're doing, you might be able to read the data in nearly as fast as you can process it. To speed things up, you might try a two-threaded producer-consumer model (one thread reads the file and hands the data off to the other thread for processing).
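
A rough sketch of that producer-consumer idea, using a bounded BlockingQueue so the reader can only get a few chunks ahead of the processor. The class PipelinedReader, the method sumBytes, and the byte-summing "processing" step are placeholders invented for this example; substitute your real work.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical example class, not part of the JDK.
final class PipelinedReader {
    // Sentinel object that tells the consumer there is no more data.
    private static final byte[] END = new byte[0];

    /** Reads the file on a background thread while the calling thread sums its bytes. */
    static long sumBytes(Path file, int chunkSize, int queueDepth) throws Exception {
        BlockingQueue<byte[]> queue = new ArrayBlockingQueue<>(queueDepth);
        Throwable[] failure = new Throwable[1];

        Thread producer = new Thread(() -> {
            try (InputStream in = Files.newInputStream(file)) {
                byte[] chunk = new byte[chunkSize];
                int n;
                while ((n = in.read(chunk)) > 0) {
                    // Copy, because the chunk array is reused for the next read.
                    queue.put(Arrays.copyOf(chunk, n)); // blocks while the queue is full
                }
            } catch (Exception e) {
                failure[0] = e;
            } finally {
                try {
                    queue.put(END);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        });
        producer.start();

        long sum = 0;
        // Consumer: placeholder processing that adds up the unsigned byte values.
        for (byte[] chunk = queue.take(); chunk != END; chunk = queue.take()) {
            for (byte b : chunk) {
                sum += b & 0xFF;
            }
        }
        producer.join();
        if (failure[0] != null) {
            throw new IOException("reader thread failed", failure[0]);
        }
        return sum;
    }
}
```

Memory use is bounded by roughly chunkSize times queueDepth, so this keeps the pipeline busy without needing a multi-gigabyte buffer. Whether it actually helps depends on whether reading or processing is the bottleneck.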




Answer 4:


The OS is going to cache as much of the file as it can, so trying to outsmart the cache manager probably isn't going to get you very much.

From a performance perspective, you will be much better served by keeping the bytes outside the JVM heap (transferring huge chunks of data between the OS and the JVM is relatively slow). You can achieve this by using a MappedByteBuffer, which is a direct buffer whose contents are a memory-mapped region of the file.
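
One thing to watch for: a single call to FileChannel.map cannot map more than Integer.MAX_VALUE bytes, so a file larger than 2 GB has to be mapped as several windows. Below is a minimal sketch of that approach. The class MappedFile and its methods are hypothetical names for this example, and the window size is a parameter you would tune yourself.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical wrapper, not part of the JDK.
final class MappedFile {
    private final MappedByteBuffer[] windows;
    private final long windowSize;
    private final long size;

    private MappedFile(MappedByteBuffer[] windows, long windowSize, long size) {
        this.windows = windows;
        this.windowSize = windowSize;
        this.size = size;
    }

    /** Maps the whole file read-only as consecutive windows of at most windowSize bytes. */
    static MappedFile open(Path file, long windowSize) throws IOException {
        if (windowSize <= 0 || windowSize > Integer.MAX_VALUE) {
            throw new IllegalArgumentException("windowSize must be in 1..Integer.MAX_VALUE");
        }
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            long size = channel.size();
            int count = (int) ((size + windowSize - 1) / windowSize); // round up
            MappedByteBuffer[] windows = new MappedByteBuffer[count];
            for (int i = 0; i < count; i++) {
                long start = i * windowSize;
                long length = Math.min(windowSize, size - start);
                windows[i] = channel.map(FileChannel.MapMode.READ_ONLY, start, length);
            }
            // The mappings stay valid after the channel is closed.
            return new MappedFile(windows, windowSize, size);
        }
    }

    long size() {
        return size;
    }

    /** Returns the byte at an absolute file offset, which may exceed 2^31 - 1. */
    byte get(long index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        return windows[(int) (index / windowSize)].get((int) (index % windowSize));
    }
}
```

The mapped pages live outside the Java heap and are paged in by the OS on demand, so they do not count against the heap limit. A window size of 1 GB (1L << 30) would be a reasonable starting point, but that is a guess rather than a measured value.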

Here's a pertinent how-to type of article: article




Answer 5:


I think there are 64-bit JVMs that support nonstandard limits.

You might try buffering chunks.
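
If you do want the data on the heap, one way to read "buffering chunks" is to hold the file in a list of ordinary byte arrays and address it with a long index. This is only a sketch under that interpretation; ChunkedBuffer and its methods are names made up for this example.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical class, not part of the JDK.
final class ChunkedBuffer {
    private final List<byte[]> chunks;
    private final int chunkSize;
    private final long size;

    private ChunkedBuffer(List<byte[]> chunks, int chunkSize, long size) {
        this.chunks = chunks;
        this.chunkSize = chunkSize;
        this.size = size;
    }

    /** Reads up to maxBytes from the stream into chunks of chunkSize bytes each. */
    static ChunkedBuffer read(InputStream in, int chunkSize, long maxBytes) throws IOException {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<byte[]> chunks = new ArrayList<>();
        long total = 0;
        while (total < maxBytes) {
            int want = (int) Math.min(chunkSize, maxBytes - total);
            byte[] chunk = new byte[want];
            int filled = 0;
            // Keep reading until this chunk is full or the stream ends.
            while (filled < want) {
                int n = in.read(chunk, filled, want - filled);
                if (n < 0) {
                    break;
                }
                filled += n;
            }
            if (filled == 0) {
                break; // nothing left to read
            }
            chunks.add(filled == want ? chunk : Arrays.copyOf(chunk, filled));
            total += filled;
            if (filled < want) {
                break; // stream ended inside this chunk
            }
        }
        return new ChunkedBuffer(chunks, chunkSize, total);
    }

    long size() {
        return size;
    }

    /** Every chunk except possibly the last is full, so plain division finds the right one. */
    byte get(long index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        return chunks.get((int) (index / chunkSize))[(int) (index % chunkSize)];
    }
}
```

The total can exceed 2 GB because no single array has to hold everything, but the JVM's maximum heap (the -Xmx setting) must be large enough for all the chunks.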



Source: https://stackoverflow.com/questions/141987/buffer-a-large-file-bufferedinputstream-limited-to-2gb-arrays-limited-to-231
