How to implement a write cache that swaps data to disk only when free memory is low

Question

I want to cache data produced by my application in memory, but if memory becomes scarce I would like to swap the data to disk.

Ideally I would want to be notified by the VM that it needs memory and write my data to disk and free some memory that way. But I don't see any way to hook myself into the VM in a manner that notifies me before an OutOfMemoryError occurs somewhere (most likely in code not related to the cache in any way).

The Reference classes in java.lang.ref do not seem to be of any use in this case: their notification mechanism (ReferenceQueue) only triggers after the reference has already been reclaimed by the GC, by which time it is too late to save the data to disk.

What alternatives are available to manage the heap memory efficiently, that is, without swapping to disk until absolutely unavoidable?

Edit1: In response to the comment "The OS already does that for you" - this only covers part of the issue - the amount of memory the OS can allocate is a limited resource. There are also other limits than the amount of memory available to the OS that need to be considered here:

The limit imposed by the VM's architecture (32-Bit VM)
The limit of memory that can be allocated to the VM's process (32-Bit OS)
The limit possibly imposed on the VM using the -Xmx option

Simply running the VM with unlimited heap size will not prevent it from running out of memory. Even if the OS still has plenty of memory available, it may not be available to the VM for the reasons above.

Answer 1:

I recommend you use some API calls to monitor the free memory available and act accordingly.

See this question about how to monitor the amount of free memory available to the JVM.
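As a minimal sketch of what "monitor the free memory" could look like, the snippet below uses only java.lang.Runtime. The class name HeapHeadroom and the isLow helper are made up for illustration. Note that the figure is only an estimate: "used" memory includes garbage that has not been collected yet, so the real headroom is usually larger than reported.

```java
final class HeapHeadroom {

    /**
     * Approximate number of bytes the heap can still hand out before it
     * reaches its ceiling (for example the one set with -Xmx).
     */
    static long availableBytes() {
        Runtime rt = Runtime.getRuntime();
        // totalMemory() is the currently committed heap; freeMemory() is the
        // unused part of it. The difference still contains uncollected garbage.
        long used = rt.totalMemory() - rt.freeMemory();
        // maxMemory() is the most the VM will attempt to use
        // (Long.MAX_VALUE if there is no inherent limit).
        return rt.maxMemory() - used;
    }

    /** True if the estimated headroom is below the given number of bytes. */
    static boolean isLow(long minFreeBytes) {
        return availableBytes() < minFreeBytes;
    }

    public static void main(String[] args) {
        System.out.printf("approx. available heap: %d MB%n", availableBytes() >> 20);
    }
}
```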

Answer 2:

You can write a thread that checks for free memory repeatedly and acts if a limit is passed.
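A rough sketch of such a checking thread, assuming a scheduled executor is acceptable in place of a hand-written loop. LowMemoryWatcher, heapHeadroom and the callback are hypothetical names, and the threshold and period are values you would have to tune yourself. The check is based on Runtime figures, which count uncollected garbage as used memory, so it can fire earlier than strictly necessary.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.LongSupplier;

final class LowMemoryWatcher implements AutoCloseable {
    private final ScheduledExecutorService scheduler;

    /**
     * @param availableBytes supplies the current estimate of free heap
     * @param minFreeBytes   run the callback when the estimate drops below this
     * @param periodMillis   delay between two checks
     * @param onLow          what to do when memory is low, e.g. flush the cache
     */
    LowMemoryWatcher(LongSupplier availableBytes, long minFreeBytes,
                     long periodMillis, Runnable onLow) {
        scheduler = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "low-memory-watcher");
            t.setDaemon(true); // do not keep the VM alive just for monitoring
            return t;
        });
        scheduler.scheduleWithFixedDelay(() -> {
            try {
                if (availableBytes.getAsLong() < minFreeBytes) {
                    onLow.run();
                }
            } catch (RuntimeException e) {
                // An uncaught exception would cancel all future checks.
                e.printStackTrace();
            }
        }, 0, periodMillis, TimeUnit.MILLISECONDS);
    }

    /** Estimated heap headroom, based on java.lang.Runtime. */
    static long heapHeadroom() {
        Runtime rt = Runtime.getRuntime();
        return rt.maxMemory() - (rt.totalMemory() - rt.freeMemory());
    }

    @Override
    public void close() {
        scheduler.shutdownNow();
    }
}
```

It could be wired up along the lines of `new LowMemoryWatcher(LowMemoryWatcher::heapHeadroom, 64L << 20, 1000, cache::flush)`, where `cache::flush` stands for whatever writes your data to disk.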

Answer 3:

I would use an internal database (Derby comes to mind for development purposes, replacing it with your chosen flavor for deployment). Typically they have this functionality built in already, and you can configure how much of the database to keep cached in memory.

Answer 4:

That's a very difficult thing to do in pure Java, for the reasons you've already hinted at.

It is quite normal for the heap to become nearly full before GC kicks in, so the only way you can determine how much free memory is really available is to do a GC (and you don't want to do that too often). You could use the CMSInitiatingOccupancyFraction option (which applies to the CMS collector) to make sure GC happens when the old generation is (say) 80% full - you could then assume the value of "free memory" returned by the Management API is probably about right (for values > 80%). But there's no guarantee, of course.
As you mentioned, soft references are automatically cleared by the collector before being added to the queues with which they are registered, so they aren't particularly helpful here. You could create a dummy SoftReference and use its enqueuing as an indication that memory is low. But I'm not sure about timing - could you guarantee to dump all of your data to disk before the JVM runs out of memory? Probably not.
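Related to the Management API mentioned above: java.lang.management also lets you set a collection usage threshold on a memory pool, which is checked against the usage measured right after a GC rather than before it. The following is only a sketch of that alternative, not something the options above do for you. PostGcThresholdHook is a made-up name, picking the largest heap pool is an assumption that it is the old/tenured generation, and the timing concern still applies: the notification arrives after the collection, with no guarantee that there is enough room left to finish writing to disk.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryNotificationInfo;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;
import javax.management.NotificationEmitter;
import javax.management.NotificationListener;

final class PostGcThresholdHook {

    /**
     * Arms the largest heap pool that supports collection usage thresholds
     * (assumed here to be the old generation) at the given fraction of its
     * maximum size. Returns the pool name, or null if no pool qualifies.
     */
    static String install(double fraction, Runnable onLow) {
        MemoryPoolMXBean target = null;
        long targetMax = 0;
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() != MemoryType.HEAP
                    || !pool.isCollectionUsageThresholdSupported()) {
                continue;
            }
            MemoryUsage usage = pool.getUsage();
            // getMax() is -1 when the maximum is undefined; such pools are skipped.
            if (usage != null && usage.getMax() > targetMax) {
                target = pool;
                targetMax = usage.getMax();
            }
        }
        if (target == null) {
            return null;
        }
        target.setCollectionUsageThreshold((long) (targetMax * fraction));

        // The listener runs on a VM-internal thread, so onLow should be quick
        // or hand the actual flushing over to another thread.
        NotificationListener listener = (notification, handback) -> {
            if (MemoryNotificationInfo.MEMORY_COLLECTION_THRESHOLD_EXCEEDED
                    .equals(notification.getType())) {
                onLow.run();
            }
        };
        NotificationEmitter emitter =
                (NotificationEmitter) ManagementFactory.getMemoryMXBean();
        emitter.addNotificationListener(listener, null, null);
        return target.getName();
    }
}
```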

Could you instead flush your cache to disk when it reaches a certain size, e.g. if it exceeds 500MB then flush it?
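A minimal sketch of that size-based approach, assuming the cached data can be treated as opaque byte arrays that are appended to a single spill file. SizeBoundedWriteCache and its methods are invented for this example, and the limit (500MB above) is whatever you pass in.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayDeque;
import java.util.Deque;

final class SizeBoundedWriteCache {
    private final Deque<byte[]> pending = new ArrayDeque<>();
    private final Path spillFile;
    private final long maxBytes;
    private long bufferedBytes;

    SizeBoundedWriteCache(Path spillFile, long maxBytes) {
        this.spillFile = spillFile;
        this.maxBytes = maxBytes;
    }

    /** Buffers the record in memory and flushes once the size limit is exceeded. */
    synchronized void put(byte[] record) throws IOException {
        pending.addLast(record);
        bufferedBytes += record.length;
        if (bufferedBytes > maxBytes) {
            flush();
        }
    }

    /** Appends all buffered records to the spill file and releases them. */
    synchronized void flush() throws IOException {
        try (OutputStream out = Files.newOutputStream(spillFile,
                StandardOpenOption.CREATE, StandardOpenOption.APPEND)) {
            for (byte[] record : pending) {
                out.write(record);
            }
        }
        pending.clear();
        bufferedBytes = 0;
    }

    synchronized long bufferedBytes() {
        return bufferedBytes;
    }
}
```

This only counts payload bytes, not object overhead, so the actual heap usage is somewhat higher than the limit suggests.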

Or could you use a MappedByteBuffer with a private mapping, so that the data you write is not flushed to the file? If I remember correctly, the data you write is stored in off-heap "direct" memory (at least on Linux) and so won't consume any of your heap - but please check that. If RAM became exhausted you would of course start to use swap.
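For reference, a private (copy-on-write) mapping can be obtained roughly as follows. This is a sketch only: PrivateMapping and mapPrivate are made-up names, and the file is assumed to exist already and to be at least `size` bytes long. Writes to the returned buffer are visible through the buffer but are not propagated to the file.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class PrivateMapping {

    /**
     * Maps the first {@code size} bytes of an existing file copy-on-write.
     * The channel must be open for reading and writing even though changes
     * never reach the file.
     */
    static MappedByteBuffer mapPrivate(Path file, long size) throws IOException {
        try (FileChannel channel = FileChannel.open(file,
                StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            // The mapping stays valid after the channel has been closed.
            return channel.map(FileChannel.MapMode.PRIVATE, 0, size);
        }
    }
}
```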

Answer 5:

Have you considered using memory mapped files? See http://en.wikipedia.org/wiki/Memory-mapped_file

Mapped memory lives outside the Java heap, so it solves your problem of not being able to access more memory than is allocated to the VM's heap (a 32-bit process is still limited by its address space).
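A small sketch of a file-backed buffer using a shared (READ_WRITE) mapping, in case it helps to see the shape of it. MappedSpillBuffer is a hypothetical class, and the capacity is fixed up front because a single MappedByteBuffer cannot exceed Integer.MAX_VALUE bytes. The OS decides when dirty pages are actually written out; force() asks for that to happen now.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class MappedSpillBuffer {
    private final MappedByteBuffer buffer;

    /** Maps (and, if necessary, extends) the file to the given capacity. */
    MappedSpillBuffer(Path file, int capacityBytes) throws IOException {
        try (FileChannel channel = FileChannel.open(file,
                StandardOpenOption.CREATE, StandardOpenOption.READ,
                StandardOpenOption.WRITE)) {
            buffer = channel.map(FileChannel.MapMode.READ_WRITE, 0, capacityBytes);
        }
    }

    /** Copies data into the mapped region starting at offset. */
    void write(int offset, byte[] data) {
        ByteBuffer view = buffer.duplicate(); // own position, shared content
        view.position(offset);
        view.put(data);
    }

    /** Reads length bytes starting at offset. */
    byte[] read(int offset, int length) {
        byte[] out = new byte[length];
        ByteBuffer view = buffer.duplicate();
        view.position(offset);
        view.get(out);
        return out;
    }

    /** Requests that modified pages be written to the underlying file. */
    void sync() {
        buffer.force();
    }
}
```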

Source: https://stackoverflow.com/questions/8431969/how-to-implement-a-write-cache-that-swaps-data-to-disk-only-when-free-memory-is

Tags

java

memory-management

garbage-collection