Question
On Mac OS X 10.5.8 I have a Java program that runs at 100% CPU for a very long time -- several days or more (it's a model checker analyzing a concurrent program, so that's more or less expected). However, its virtual memory size, as shown in OS X's Activity Monitor, becomes enormous after a day or so: right now it's 16GB and growing. Physical memory usage is roughly stable at 1.1GB or so.
I would like to know: is the 16GB (and growing) a sign of problems that could be slowing my program?
I start the program with "java -Xmx1024m -ea"
java version "1.6.0_24"
Java(TM) SE Runtime Environment (build 1.6.0_24-b07-334-9M3326)
Java HotSpot(TM) 64-Bit Server VM (build 19.1-b02-334, mixed mode)
Thanks to everyone for their suggestions. I will try the profiling suggestions given in some of the answers and come back (it may take a while because of the multi-day run times).
In answer to some of the points below, the model checker does almost no I/O (only print statements, depending on the debug settings). In the mode I'm using it has no GUI. I am not the primary author of the model checker (though I have worked on some of its internals), but I do not believe that it makes any use of JNI. [<--- edit: this is wrong; details below] It does not do any memory mapping. Also, I am not asking Oracle/Sun's JVM to create lots of threads (see below for an explanation).
The extra virtual memory has not caused the model checker to die, but, judging from the frequency of its printed output, it gradually runs more and more slowly as the virtual memory usage increases. (Perhaps that is just because of more and more garbage collection, though.) I plan to try it on a Windows machine on Monday to see if the same problem happens.
A little extra explanation: The model checker I'm running (JPF) is itself a nearly complete JVM (written entirely in Java) that runs under Oracle/Sun's JVM. Of course, as a virtual machine, JPF is highly specialized to support model checking.
It's a bit counterintuitive, but this means that even though the program I'm model checking is designed to be multithreaded, as far as Sun's JVM is concerned there is only a single thread: the one running JPF. JPF emulates the threads my program needs as part of its model checking process.
I believe that Stephen C has pinpointed the problem; Roland Illig gave me the tools to verify it. I was wrong about the use of JNI. JPF itself doesn't use JNI, but it allows plugins, and JNI was used by one of the configured plugins. Fortunately there are equivalent plugins I can use that are pure Java. Preliminary use of one of them shows no growth in virtual memory over the last few hours. Thanks to everyone for their help.
Answer 1:
I suspect that it is a leak too. But it can't be a leak of 'normal' memory because the -Xmx1024m option is capping the normal heap. Likewise, it won't be a leak of 'permgen' heap, because the default maximum size of permgen is small.
So I suspect it is one of the following:
You are leaking threads; i.e. threads are being created but are not terminating. They might not be active, but each thread has a stack segment (256 KB to 1 MB by default, depending on the platform) that is not allocated in the regular heap.
You are leaking direct-mapped files. These are mapped to memory segments allocated by the OS outside of the regular heap. (@bestsss suggests that you look for leaked ZIP file handles, which I think would be a sub-case of this.)
You are using some JNI / JNA code that is leaking malloc'ed memory, or similar.
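As a rough sketch of mine (not code from the original answer), here is how memory outside the Java heap can grow even though -Xmx caps the heap; it uses direct buffers and extra threads as stand-ins for the kinds of native allocations listed above:

    // A minimal illustration (assumption: deliberately scaled-down numbers, not an
    // attempt to reproduce the 16GB from the question). The heap is capped by -Xmx,
    // but direct buffers and thread stacks live in native memory, so the process's
    // virtual size can keep growing anyway.
    import java.nio.ByteBuffer;
    import java.util.ArrayList;
    import java.util.List;

    public class OffHeapGrowth {
        public static void main(String[] args) throws InterruptedException {
            List<ByteBuffer> buffers = new ArrayList<ByteBuffer>();
            for (int i = 0; i < 32; i++) {
                // Direct buffers are allocated outside the Java heap
                // (limited by -XX:MaxDirectMemorySize, not by -Xmx).
                buffers.add(ByteBuffer.allocateDirect(1024 * 1024)); // 1 MB each

                // Every live thread also owns a native stack (roughly 256 KB to 1 MB,
                // platform dependent) that is not part of the heap either.
                Thread t = new Thread(new Runnable() {
                    public void run() {
                        try { Thread.sleep(Long.MAX_VALUE); } catch (InterruptedException e) { }
                    }
                });
                t.setDaemon(true);
                t.start();
            }
            long used = Runtime.getRuntime().totalMemory() - Runtime.getRuntime().freeMemory();
            System.out.println("heap used: " + (used / (1024 * 1024)) + " MB");
            // While this sleeps, Activity Monitor or top show the virtual size sitting
            // well above the heap usage printed here.
            Thread.sleep(60000);
            System.out.println("still holding " + buffers.size() + " direct buffers");
        }
    }

A memory profiler will show the heap side of this picture; the native side (buffers, stacks, malloc'ed JNI memory) has to be inferred from the gap between heap usage and the process's virtual size.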
Either way, a memory profiler is likely to isolate the problem, or at least eliminate some of the possibilities.
A JVM memory leak is also a possibility, but it is unwise to start suspecting the JVM until you have definitively eliminated possible causes in your own code and libraries / applications that you are using.
Answer 2:
Since your application is not a real-time application, you can do the following:
jps -v
Get the process ID from that table and save it as pid.
jmap -histo $pid > before-gc.hgr
jmap -histo:live $pid > after-gc.hgr
jstack -l $pid > threads.txt
The threads.txt file tells you what the process is doing at the moment.
The heap usage histograms before-gc.hgr and after-gc.hgr tell you how much memory a full garbage collection could free.
Maybe this will give you some hints about what is happening.
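As a complement to these commands (my own suggestion, not part of the original answer), the standard java.lang.management API can report heap and non-heap usage from inside the running program, which is convenient for multi-day runs. Note that "non-heap" here only covers areas such as permgen and the code cache; memory allocated by JNI code, mapped files, or thread stacks will not show up in it:

    import java.lang.management.ManagementFactory;
    import java.lang.management.MemoryMXBean;
    import java.lang.management.MemoryUsage;

    public class MemoryReport {
        // Print heap and non-heap usage; this could be called from the model
        // checker's existing periodic print statements.
        public static void print() {
            MemoryMXBean bean = ManagementFactory.getMemoryMXBean();
            MemoryUsage heap = bean.getHeapMemoryUsage();
            MemoryUsage nonHeap = bean.getNonHeapMemoryUsage();
            System.out.println("heap used = " + (heap.getUsed() >> 20) + " MB, "
                    + "non-heap used = " + (nonHeap.getUsed() >> 20) + " MB");
        }
    }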
Answer 3:
Are you properly NULLing all references once you're done with the objects? I wonder if this is a simple memory leak...
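Purely as an illustration of the pattern this answer asks about (a sketch of mine, not code from the question), objects that are never cleared out of a long-lived collection stay reachable, so the garbage collector can never reclaim them:

    import java.util.ArrayList;
    import java.util.List;

    public class ReachableLeak {
        private static final List<int[]> cache = new ArrayList<int[]>();

        static void process() {
            int[] work = new int[256 * 1024];  // roughly 1 MB of working data
            cache.add(work);                   // still referenced after we are "done" with it
            // ... use work ...
            // Without cache.remove(work), every call pins another ~1 MB in the heap.
        }

        public static void main(String[] args) {
            for (int i = 0; i < 100; i++) {
                process();
            }
            System.out.println("retained arrays: " + cache.size());
        }
    }

(With the question's -Xmx1024m cap, a heap leak like this would eventually end in an OutOfMemoryError rather than the steadily growing virtual size described above.)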
I have seen this kind of virtual-memory growth in C programs that mix a large number of allocations and deallocations for objects of different sizes. You can wind up with pages of memory that are half-utilized but don't have any holes large enough to satisfy new memory requests.
This not only causes swapping, it also destroys locality of reference over time -- swapping gets worse, then much worse.
The solution in C programs is usually to switch to a slab allocator and keep objects of different sizes on different pages of memory. No more holes of odd sizes, and there's always a pointer to a space of just the right size for whatever your object is.
Of course, doing this in a managed environment like Java might be difficult. One would hope the JVM is already doing it. (Given that managed environments could update the references to new memory locations and re-pack periodically, even a naive memory allocation approach ought to prevent silly holes.)
The indication that there's a problem usually comes from swap traffic. If you see a lot of swap traffic on your system even when the process's RSS (Resident Set Size) remains stable, then you could be doing tons of disk I/O for operations that appear to run entirely in memory. On Linux and some other Unix systems you can find swap traffic information with the vmstat 1 command:
$ vmstat 1
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
r b swpd free buff cache si so bi bo in cs us sy id wa
0 0 0 727392 351128 3846980 0 0 13 16 41 73 3 1 96 0
0 0 0 727392 351128 3846988 0 0 0 0 1267 4742 5 1 95 0
...
The si and so columns show blocks per second. (Well, per whatever time interval you've selected with vmstat 1, or vmstat 2, and so on.)
If the swap traffic is low, then you probably don't have much to worry about. If the swap traffic is high, you should definitely try to figure out why you're swapping. That takes more work. :)
Answer 4:
It certainly sounds like a memory leak to me. You should investigate with a heap profiler such as VisualVM (which comes with your JDK).
Edit: I overlooked your heap limit. So it can't really be a simple Java memory leak (though it might be that for some reason the heap size is being ignored - still worth checking out). Another possibility is that your app is holding onto a large amount of non-heap memory resources, such as native GUI peers (windows that are hidden but not disposed), threads (thread objects are small, but each has a large stack, which would never be released if the thread is e.g. created but not started) or I/O (not sure how that could take up so much memory before running out of file handles). On a deeper level, it could be an allocation problem as sarnold wrote, or a memory leak within the JVM itself.
Answer 5:
If jvisualvm does not report excessive memory usage, this turns into an OS X application issue, where the application is the JVM.
I suggest you ask that as a separate question instead.
Answer 6:
It's probably impossible to answer this accurately without knowing exactly what the program is doing. Maybe it really just uses a lot of memory...
In any case, you can try connecting to the process using the VisualVM tool and see what's going on with memory allocation and deallocation.
Also, there are quite a few -XX: options for debugging garbage collection; they may provide even more information.
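For example (the exact flag set is my suggestion, but these options exist in HotSpot of this vintage), adding GC logging to the original command line shows how often collections run and how long they take:

    java -Xmx1024m -ea -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps ...

where "..." stands for however the model checker is normally launched.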
Source: https://stackoverflow.com/questions/6240985/java-program-with-16gb-virtual-memory-and-growing-is-it-a-problem