Question
I'm getting a large amount of data from a database query and making objects out of it. I end up with a list of these objects (about 1M of them) that I want to serialize to disk for later use. The problem is that it barely fits in memory now and won't fit at all in the future, so I need some system to serialize, say, the first 100k, then the next 100k, and so on; and also to read the data back in 100k increments.
I could write some obvious code that checks if the list gets too big and then writes it to file 'list1', then 'list2', and so on, but maybe there's a better way to handle this?
Answer 1:
You could create each object as you go through the query results and feed it immediately to an ObjectOutputStream, which writes it to the file, so the full list never has to exist in memory.
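A minimal sketch of that idea, assuming a hypothetical fetchNextRecord() helper that maps the next DB row to a Serializable object and returns null when the rows run out (the helper and the file name are invented for illustration). One real pitfall: ObjectOutputStream keeps a back-reference table of every object it has written, so over a million objects you should call reset() periodically or the stream itself will consume the memory you are trying to save:

import java.io.*;

try (ObjectOutputStream out = new ObjectOutputStream(
        new BufferedOutputStream(new FileOutputStream("records.ser")))) {
    Object record;
    int count = 0;
    while ((record = fetchNextRecord()) != null) {  // hypothetical row-to-object mapper
        out.writeObject(record);
        if (++count % 10_000 == 0) {
            out.reset();  // clear the back-reference table so the stream's memory stays bounded
        }
    }
}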
Answer 2:
- Read the objects one by one from the DB.
- Don't put them into a list; write them into the file as you get them from the DB.
- Never keep more than a single object in RAM.
- When you read the data back, terminate the reading loop when readObject() throws EOFException (or write a null as the very last object and stop when readObject() returns null).
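A sketch of that read loop (process() is a hypothetical per-object handler, and the enclosing method is assumed to declare throws IOException, ClassNotFoundException):

import java.io.*;

try (ObjectInputStream in = new ObjectInputStream(
        new BufferedInputStream(new FileInputStream("records.ser")))) {
    while (true) {
        try {
            process(in.readObject());  // process() is a hypothetical per-object handler
        } catch (EOFException end) {
            break;  // readObject() signals a plain end of file by throwing EOFException
        }
    }
}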
Answer 3:
I assume you have checked that it is really necessary to save the data to disk. It couldn't stay in the database, could it?
To handle data that is too big, you need to make it smaller :-)
One idea is to get the data by chunks:
- start with the request itself, so you never build this huge list (because that will become a point of failure sooner or later)
- serialize the smaller list of objects
- then loop until the rows run out (a sketch follows below)
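One way to realize that loop, assuming a database that supports LIMIT/OFFSET and an invented records table, a Serializable MyRecord class, and an open Connection conn (none of these come from the question). Each chunk goes to its own file, echoing the asker's 'list1', 'list2' naming:

import java.io.*;
import java.sql.*;
import java.util.*;

int chunkSize = 100_000;
for (int offset = 0, fileIndex = 1; ; offset += chunkSize, fileIndex++) {
    List<MyRecord> chunk = new ArrayList<>();
    try (PreparedStatement ps = conn.prepareStatement(
            "SELECT id, payload FROM records ORDER BY id LIMIT ? OFFSET ?")) {
        ps.setInt(1, chunkSize);
        ps.setInt(2, offset);
        try (ResultSet rs = ps.executeQuery()) {
            while (rs.next()) {
                chunk.add(new MyRecord(rs.getLong("id"), rs.getString("payload")));
            }
        }
    }
    if (chunk.isEmpty()) {
        break;  // no more rows
    }
    try (ObjectOutputStream out = new ObjectOutputStream(
            new BufferedOutputStream(new FileOutputStream("list" + fileIndex)))) {
        out.writeObject(chunk);  // one smaller, serializable list per file
    }
}

For very large tables, paging with WHERE id > lastSeenId usually scales better than a growing OFFSET, which the database has to skip past on every iteration.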
Answer 4:
Also think about setting the fetch size for the JDBC driver; for example, the JDBC driver for MySQL defaults to fetching the whole result set into memory. See your driver's documentation on fetch size for more information.
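For MySQL Connector/J specifically, the documented way to stream rows one at a time instead of buffering the whole result set is a forward-only, read-only statement with a fetch size of Integer.MIN_VALUE; most other drivers treat setFetchSize() as an ordinary hint. A sketch, assuming an open Connection conn and the same invented table as above:

import java.sql.*;

try (Statement stmt = conn.createStatement(ResultSet.TYPE_FORWARD_ONLY,
                                           ResultSet.CONCUR_READ_ONLY)) {
    stmt.setFetchSize(Integer.MIN_VALUE);  // MySQL Connector/J: stream rows one at a time
    try (ResultSet rs = stmt.executeQuery("SELECT id, payload FROM records")) {
        while (rs.next()) {
            // convert and serialize each row here instead of collecting a list
        }
    }
}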
Answer 5:
It seems that you are retrieving a large dataset from the DB, converting it into a list of objects, and serializing it in a single shot.
Don't do that; it may eventually crash the application.
Instead you have to:
- minimize the amount of data retrieved from the database (say, 1,000 records instead of 1M)
- convert them into business objects
- serialize them
- and repeat the procedure until the last record
This way you can avoid the memory problem.
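The reading side of that procedure can then restore one chunk at a time; this sketch follows the 'list1', 'list2', ... file naming from the earlier answer, with the same invented MyRecord type, and assumes the enclosing method declares throws IOException, ClassNotFoundException:

import java.io.*;
import java.util.*;

for (int i = 1; ; i++) {
    File chunkFile = new File("list" + i);
    if (!chunkFile.exists()) {
        break;  // no more chunk files
    }
    try (ObjectInputStream in = new ObjectInputStream(
            new BufferedInputStream(new FileInputStream(chunkFile)))) {
        @SuppressWarnings("unchecked")
        List<MyRecord> chunk = (List<MyRecord>) in.readObject();
        // work on this chunk, then let it go out of scope before reading the next one
    }
}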
Answer 6:
ObjectOutputStream will work, but it has more overhead (class descriptors, object headers). I think DataOutputStream/DataInputStream is a better choice if your data boils down to primitives.
Just read/write the values one by one, and wrap the file stream in a BufferedOutputStream so you don't pay for a disk access per value. For example, you can do something like this (numbers stands in for your data source):
try (DataOutputStream os = new DataOutputStream(
        new BufferedOutputStream(new FileOutputStream("myfile")))) {
    for (int num : numbers)
        os.writeInt(num);
}
One gotcha with both object and data streams is that write(int) writes only a single byte. Use writeInt(int) instead.
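The matching read side with DataInputStream could look like this (same example file name as above); readInt() throws EOFException when the file is exhausted, which ends the loop:

import java.io.*;

try (DataInputStream is = new DataInputStream(
        new BufferedInputStream(new FileInputStream("myfile")))) {
    while (true) {
        try {
            int num = is.readInt();
            // use num here
        } catch (EOFException end) {
            break;  // readInt() throws EOFException at end of file
        }
    }
}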
Source: https://stackoverflow.com/questions/1464764/serializing-a-very-large-list