I'm trying to read a binary file containing 100,000 different objects.
Buffering a simple text file with the same content takes only 2 MB with a BufferedReader.
But reading the binary file takes up to 700 MB, and I get an OutOfMemoryError if I increase the number of objects to read.
So how can I read the file and get the objects one by one without saturating the memory?
Here is the code I'm testing:
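import java.io.BufferedInputStream;
import java.io.EOFException;
import java.io.FileInputStream;
import java.io.InputStream;
import java.io.ObjectInputStream;

// Wrapper class added so the snippet compiles; the name is arbitrary
public class ReadBinaryFile {
    // Captured at class load so timeTaken() can report the elapsed time
    private static final long startTime = System.currentTimeMillis();
    private static final long MEGABYTE = 1024L * 1024L;

    public static void main(String[] args) throws Exception {
        int i = 0;
        String path = "data/file.bin";
        // try-with-resources closes the whole stream chain, even on error
        try (InputStream file = new FileInputStream(path);
             InputStream buffer = new BufferedInputStream(file);
             ObjectInputStream in = new ObjectInputStream(buffer)) {
            Object obj = null;
            while (i < 100000 && (obj = in.readObject()) != null) {
                String str = obj.toString();
                System.out.println(str);
                i++;
            }
        } catch (EOFException e) {
            // readObject() signals end of file with EOFException, not null
        }
        timeTaken();
    }

    // Prints the elapsed time and the heap memory currently in use
    public static void timeTaken() {
        Runtime runtime = Runtime.getRuntime();
        long endTime = System.currentTimeMillis();
        long memory = runtime.totalMemory() - runtime.freeMemory();
        long megabytes = memory / MEGABYTE;
        System.out.println("It took " + megabytes + "mb in " + ((endTime - startTime) / 1000) + "s (" + memory + " bytes in " + (endTime - startTime) + " ms)");
    }
}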
As far as I know, ObjectInputStream keeps a reference to every object it has read until the stream is closed, because it needs them to resolve back-references to objects that appear more than once in the stream. So if your binary file is ~207 MB, the real objects in the Java heap may easily take several GB of RAM, and they can't be garbage collected. This raises the question: do you need all of your data to be held in RAM simultaneously?
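If you control the code that writes the file, one way to limit this caching while staying with object serialization is to call ObjectOutputStream.reset() every so often on the writing side. The reset marker in the stream makes the reader discard the references it has accumulated so far. The following is only a minimal sketch of that idea: the class name ResetDemo, the batch size RESET_EVERY and the string records are made up for illustration, and it uses in-memory byte arrays instead of your file.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;

class ResetDemo {
    // Hypothetical batch size: number of objects written between two resets
    static final int RESET_EVERY = 1000;

    // Serializes `count` strings and calls reset() periodically, so that the
    // reader can drop its table of previously read objects at the same points
    public static byte[] write(int count) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            for (int i = 0; i < count; i++) {
                out.writeObject("record-" + i);
                if ((i + 1) % RESET_EVERY == 0) {
                    out.reset();
                }
            }
        }
        return bytes.toByteArray();
    }

    // Reads objects one by one until the end of the stream and returns
    // how many objects were read
    public static int readAll(byte[] data) throws IOException, ClassNotFoundException {
        int n = 0;
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            while (true) {
                in.readObject(); // process the object here, then let it go
                n++;
            }
        } catch (EOFException endOfStream) {
            // readObject() reports the end of the stream with EOFException
        }
        return n;
    }

    public static void main(String[] args) throws Exception {
        // Prints the number of objects that were read back
        System.out.println(readAll(write(2500)));
    }
}
```

Note that reset() also means an object written both before and after the reset is serialized twice, so this trades some file size for a bounded reader-side cache.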
If you do not need all the data in memory at once (you want to read an object, process it somehow, discard it and move on to the next one), I would suggest using DataInputStream instead of ObjectInputStream. I don't know whether this approach is applicable in your case, since I don't know the structure of your data, and it only works if the file was written in a matching format, for example with DataOutputStream. If your data is a collection of records with the same structure, you could do the following:
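public class MyObject {
    private int age;
    private String name;

    public MyObject(int age, String name) {
        this.age = age;
        this.name = name;
    }
}

// Needs java.io.BufferedInputStream, java.io.DataInputStream and java.io.FileInputStream.
// try-with-resources closes the stream when the loop is done.
try (DataInputStream in = new DataInputStream(new BufferedInputStream(new FileInputStream("path.to.file")))) {
    // suppose that we store the total number of objects in the first 4 bytes of the file
    int nObjects = in.readInt();
    for (int i = 0; i < nObjects; i++) {
        // each record is an int followed by a string written with writeUTF()
        MyObject obj = new MyObject(in.readInt(), in.readUTF());
        // do some stuff with obj; it becomes garbage once the iteration ends
    }
}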
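For completeness, here is a rough sketch of how a file in that layout could be produced and then streamed back. It assumes the format described above (a 4-byte record count, then an int and a writeUTF() string per record). The class name RecordFileDemo, the temporary file and the generated ages and names are invented for the example, not taken from your data.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;

class RecordFileDemo {
    // Writes a 4-byte record count, then (int age, UTF string name) per record
    public static void write(File file, int nObjects) throws IOException {
        try (DataOutputStream out = new DataOutputStream(
                new BufferedOutputStream(new FileOutputStream(file)))) {
            out.writeInt(nObjects);
            for (int i = 0; i < nObjects; i++) {
                out.writeInt(20 + i % 50);  // made-up age
                out.writeUTF("name-" + i);  // made-up name
            }
        }
    }

    // Streams the records back one at a time and returns the sum of all ages.
    // Nothing is retained between iterations, so memory use stays flat.
    public static long sumAges(File file) throws IOException {
        long sum = 0;
        try (DataInputStream in = new DataInputStream(
                new BufferedInputStream(new FileInputStream(file)))) {
            int nObjects = in.readInt();
            for (int i = 0; i < nObjects; i++) {
                int age = in.readInt();
                in.readUTF(); // the name must be read to reach the next record
                sum += age;
            }
        }
        return sum;
    }

    public static void main(String[] args) throws IOException {
        File file = File.createTempFile("records", ".bin");
        file.deleteOnExit();
        write(file, 100_000);
        System.out.println(sumAges(file));
    }
}
```

Keep in mind that writeUTF() can only encode strings of up to 65535 bytes, so longer text fields would need a different encoding, such as a length prefix followed by raw bytes.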
Source: https://stackoverflow.com/questions/49533217/read-binary-file-with-a-buffer