Question
We have an application running on the WildFly app server in clustered mode (6 nodes). We sometimes see a JVM freeze of 16 seconds when a GC is triggered. The application is time sensitive, and the other nodes in the cluster consider a node dead if its heartbeat response is not received within 15 seconds, so a node whose JVM pauses for longer than that destabilizes the whole application. To understand what goes on during GC, we enabled the hotspot and safepoint logs and see the following traces when there is a GC pause.
Can anybody explain what is meant by the following parameters?
1.) active_workers(): 13
2.) new_acitve_workers: 13
3.) prev_active_workers: 13
4.) active_workers_by_JT: 3556
5.) active_workers_by_heap_size: 146
Environment details:
- OS: Linux 64-bit, RHEL 7
- JVM: OpenJDK 1.8
- Heap size: 12GB (young: 4GB, tenured: 8GB)
- CPU cores: 16
- Virtualization: VMware ESX 5.1
JVM Arguments:
-XX:ThreadStackSize=512
-Xmx12288m
-XX:+UseParallelGC
-XX:+UseParallelOldGC
-XX:MaxPermSize=1024m
-XX:+DisableExplicitGC
-XX:NewSize=4096m
-XX:MaxNewSize=4096m
-XX:ReservedCodeCacheSize=256m
-XX:+UseCodeCacheFlushing
-XX:+UseDynamicNumberOfGCThreads
Any suggestions for tuning these JVM parameters to reduce the GC pause time?
GC logs:
GCTaskManager::calc_default_active_workers() : active_workers(): 13 new_acitve_workers: 13 prev_active_workers: 13
active_workers_by_JT: 3556 active_workers_by_heap_size: 146
GCTaskManager::set_active_gang(): all_workers_active() 1 workers 13 active 13 ParallelGCThreads 13
JT: 1778 workers 13 active 13 idle 0 more 0
2016-10-06T07:38:47.281+0530: 48313.522: [Full GC (Ergonomics) DrainStacksCompactionTask::do_it which = 3 which_stack_index = 3/empty(0) use all workers 1
DrainStacksCompactionTask::do_it which = 7 which_stack_index = 7/empty(0) use all workers 1
DrainStacksCompactionTask::do_it which = 2 which_stack_index = 2/empty(0) use all workers 1
DrainStacksCompactionTask::do_it which = 0 which_stack_index = 0/empty(0) use all workers 1
DrainStacksCompactionTask::do_it which = 11 which_stack_index = 11/empty(0) use all workers 1
DrainStacksCompactionTask::do_it which = 6 which_stack_index = 6/empty(0) use all workers 1
DrainStacksCompactionTask::do_it which = 1 which_stack_index = 1/empty(0) use all workers 1
DrainStacksCompactionTask::do_it which = 12 which_stack_index = 12/empty(0) use all workers 1
DrainStacksCompactionTask::do_it which = 4 which_stack_index = 4/empty(0) use all workers 1
DrainStacksCompactionTask::do_it which = 5 which_stack_index = 5/empty(0) use all workers 1
DrainStacksCompactionTask::do_it which = 9 which_stack_index = 9/empty(0) use all workers 1
DrainStacksCompactionTask::do_it which = 8 which_stack_index = 8/empty(0) use all workers 1
DrainStacksCompactionTask::do_it which = 10 which_stack_index = 10/empty(0) use all workers 1
StealRegionCompactionTask::do_it region_stack_index 3 region_stack = 0x780be610 empty (1) use all workers 1
StealRegionCompactionTask::do_it region_stack_index 5 region_stack = 0x780be730 empty (1) use all workers 1
StealRegionCompactionTask::do_it region_stack_index 7 region_stack = 0x780be850 empty (1) use all workers 1
StealRegionCompactionTask::do_it region_stack_index 11 region_stack = 0x780bea90 empty (1) use all workers 1
StealRegionCompactionTask::do_it region_stack_index 1 region_stack = 0x780be4f0 empty (1) use all workers 1
StealRegionCompactionTask::do_it region_stack_index 10 region_stack = 0x780bea00 empty (1) use all workers 1
StealRegionCompactionTask::do_it region_stack_index 8 region_stack = 0x780be8e0 empty (1) use all workers 1
StealRegionCompactionTask::do_it region_stack_index 4 region_stack = 0x780be6a0 empty (1) use all workers 1
StealRegionCompactionTask::do_it region_stack_index 0 region_stack = 0x780be460 empty (1) use all workers 1
StealRegionCompactionTask::do_it region_stack_index 2 region_stack = 0x780be580 empty (1) use all workers 1
StealRegionCompactionTask::do_it region_stack_index 6 region_stack = 0x780be7c0 empty (1) use all workers 1
StealRegionCompactionTask::do_it region_stack_index 12 region_stack = 0x780beb20 empty (1) use all workers 1
StealRegionCompactionTask::do_it region_stack_index 9 region_stack = 0x780be970 empty (1) use all workers 1
[PSYoungGen: 63998K->0K(4082176K)] [ParOldGen: 8346270K->3657870K(8388608K)] 8410268K->3657870K(12470784K), [Metaspace: 465864K->465775K(1495040K)], 16.0898939 secs]
[Times: user=180.57 sys=2.46, real=16.09 secs]
2016-10-06T07:39:03.373+0530: 48329.615: Total time for which application threads were stopped: 16.2510644 seconds, Stopping threads took: 0.0036805 seconds
Safepoint logs:
48313.363: ParallelGCFailedAllocation [ 2384 0 2 ] [ 0 0 3 35 16210 ] 0
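Assuming the standard -XX:+PrintSafepointStatistics column layout in JDK 8 (vmop [ threads: total initially_running wait_to_block ] [ time(ms): spin block sync cleanup vmop ] page_trap_count), this line decodes as:

total threads: 2384, initially running: 0, waiting to block: 2
spin: 0 ms, block: 0 ms, sync: 3 ms, cleanup: 35 ms, vmop: 16210 ms

In other words, reaching the safepoint was cheap (3 ms of sync, matching the "Stopping threads took: 0.0036805 seconds" above); virtually the whole 16.25-second stop is the VM operation itself (vmop: 16210 ms, matching the 16.0898939 secs Full GC in the GC log).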
Thanks in advance for your help.
Answer 1:
Judging by the ParallelGCFailedAllocation safepoint entry and the Full GC line

[PSYoungGen: 63998K->0K(4082176K)] [ParOldGen: 8346270K->3657870K(8388608K)] 8410268K->3657870K(12470784K), [Metaspace: 465864K->465775K(1495040K)], 16.0898939 secs]

we have the following conditions:
- YoungGen is almost empty (only 63M occupied out of 4G)
- OldGen is almost full (only 42M left out of 8.3G)
- The JVM tried to promote surviving objects from the YoungGen, or failed to allocate them in the survivor space, and decided to move them to the OldGen
- The OldGen had insufficient space as well (only the 42M mentioned above), so a Full GC was triggered
- The Full GC reclaimed roughly 4.5G of the OldGen (8346270K->3657870K); the arithmetic is spelled out below
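For reference, here is the arithmetic behind those bullets, taken straight from the Full GC line:

YoungGen occupancy before the Full GC: 63998K of 4082176K (~63M of ~4G)
OldGen free space before the Full GC: 8388608K - 8346270K = 42338K (~42M)
OldGen reclaimed by the Full GC: 8346270K - 3657870K = 4688400K (~4.5G)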
Even with 13 GC threads running in parallel, collecting those ~4.5G takes 16 seconds, and since you have only 16 cores there is not much room for improvement by adding more threads.
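As a side note on the traced parameters from the question (the constants below are the OpenJDK 8 HotSpot defaults as I read them, so treat the exact values as assumptions rather than a specification):

ParallelGCThreads (default on machines with more than 8 CPUs): 8 + (16 - 8) * 5/8 = 13
active_workers_by_JT: GCWorkersPerJavaThread (default 2) * Java thread count = 2 * 1778 = 3556
active_workers_by_heap_size: committed heap / HeapSizePerGCThread (~83M on 64-bit) = 12470784K / ~83M ≈ 146

With -XX:+UseDynamicNumberOfGCThreads the JVM takes the minimum of these heuristic bounds, capped at ParallelGCThreads, which is why active_workers(), new_acitve_workers and prev_active_workers all stay at 13: both bounds are far above the cap, so all 13 workers remain active.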
The following might be happening here:
- Your objects live too long for the YoungGen. In that case you could switch to CMS/G1, which collects the OldGen more frequently (and mostly concurrently), so it takes less stop-the-world time in total. You would need to tune InitiatingHeapOccupancyPercent to your needs; judging by the current output, you should initiate the concurrent cycle somewhere around 4G of occupancy. It also puts in question whether you really need those 12G of heap, since such a large heap is subject to fragmentation issues.
- Your objects are short-lived but too big to be accommodated in the survivor space. In that case you would need to tune the SurvivorRatio parameter to make the survivor spaces bigger, e.g. SurvivorRatio=4 (each survivor space is then NewSize / (SurvivorRatio + 2), i.e. about 680M with your 4G young generation; see the worked sizing below).
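A quick sanity check of that sizing, assuming the standard HotSpot young-generation layout (young gen = eden + 2 survivor spaces, eden/survivor = SurvivorRatio):

survivor space = NewSize / (SurvivorRatio + 2) = 4096M / 6 ≈ 683M
eden = 4096M - 2 * 683M ≈ 2730M

For comparison, the default SurvivorRatio=8 would give 4096M / 10 ≈ 410M per survivor space.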
So it really depends on your object allocation pattern. The best approach would be to try these changes somewhere other than production before applying them there.
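As a concrete starting point, a G1 variant of the flags might look like this (a sketch only: the pause target and occupancy threshold are assumptions to be validated against your workload, not a drop-in recommendation):

-Xmx12288m
-XX:+UseG1GC
-XX:InitiatingHeapOccupancyPercent=35
-XX:MaxGCPauseMillis=200
-XX:ThreadStackSize=512
-XX:ReservedCodeCacheSize=256m
-XX:+UseCodeCacheFlushing

InitiatingHeapOccupancyPercent=35 starts concurrent marking around 35% of 12G ≈ 4.2G, in line with the ~4G suggested above, and MaxGCPauseMillis=200 is a soft goal well under the 15-second heartbeat. Note that -XX:NewSize/-XX:MaxNewSize and -XX:+UseParallelGC/-XX:+UseParallelOldGC should be dropped, since pinning the young-generation size defeats G1's pause-time heuristics; -XX:MaxPermSize is ignored on JDK 8 anyway, where PermGen was replaced by Metaspace (as the Metaspace entry in your GC log confirms).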
The full list of JVM GC parameters can be found in the Oracle HotSpot documentation.
Source: https://stackoverflow.com/questions/39891275/jvm-flags-xxusedynamicnumberofgcthreads-xxtracedynamicgcthreads-enabled-to