Which Java thread is hogging the CPU?

萌比男神i 2020-12-07 07:48

Let's say your Java program is taking 100% CPU. It has 50 threads. You need to find which thread is guilty. I have not found a tool that can help. Currently I use the follo

12 answers
  • 2020-12-07 08:31

    Just start JVisualVM, connect to your app, and use the thread view. The thread that remains continually active is your most likely culprit.

  • 2020-12-07 08:33

    Try looking at the Hot Thread Detector plugin for VisualVM -- it uses the ThreadMXBean API to take multiple CPU consumption samples and find the most active threads. It's based on a command-line equivalent from Bruce Chapman, which might also be useful.
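    The sampling idea can be sketched with the standard ThreadMXBean API. This is not the plugin's code, just a minimal illustration that assumes it runs inside the JVM being inspected and that the JVM supports per-thread CPU time; the class name HotThreadSampler and its method are made up for this example.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: reads per-thread CPU time twice and ranks threads
// by how much CPU time they used between the two readings.
class HotThreadSampler {

    /** Returns "name: N ms" lines for the busiest threads, highest first. */
    static List<String> sample(long intervalMillis, int top) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        if (!mx.isThreadCpuTimeSupported()) {
            throw new UnsupportedOperationException("thread CPU time not supported by this JVM");
        }
        // First reading: CPU time in nanoseconds, or -1 if unavailable.
        Map<Long, Long> before = new HashMap<>();
        for (long id : mx.getAllThreadIds()) {
            before.put(id, mx.getThreadCpuTime(id));
        }
        try {
            Thread.sleep(intervalMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while sampling", e);
        }
        // Second reading: keep only threads that were measurable both times.
        Map<Long, Long> delta = new HashMap<>();
        for (long id : mx.getAllThreadIds()) {
            long now = mx.getThreadCpuTime(id);
            Long then = before.get(id);
            if (now >= 0 && then != null && then >= 0) {
                delta.put(id, now - then);
            }
        }
        List<Map.Entry<Long, Long>> ranked = new ArrayList<>(delta.entrySet());
        ranked.sort((a, b) -> Long.compare(b.getValue(), a.getValue()));

        List<String> lines = new ArrayList<>();
        for (Map.Entry<Long, Long> e : ranked.subList(0, Math.min(top, ranked.size()))) {
            ThreadInfo info = mx.getThreadInfo(e.getKey());
            String name = (info == null) ? "(exited)" : info.getThreadName();
            lines.add(name + ": " + e.getValue() / 1_000_000 + " ms");
        }
        return lines;
    }

    public static void main(String[] args) {
        // Start one busy daemon thread so there is something to find.
        Thread busy = new Thread(() -> {
            while (!Thread.currentThread().isInterrupted()) { /* spin */ }
        }, "busy-loop");
        busy.setDaemon(true);
        busy.start();
        sample(1000, 3).forEach(System.out::println);
        busy.interrupt();
    }
}
```

    Taking several samples in a row, as the plugin does, helps separate a thread that is permanently busy from one that merely had a short burst.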

  • 2020-12-07 08:37

    Take a thread dump. Wait for 10 seconds. Take another thread dump. Repeat one more time. Inspect the thread dumps and see which threads are stuck at the same place, or processing the same request. This is a manual way of doing it, but often useful.
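    Comparing the dumps by eye gets tedious with many threads, so here is a rough sketch of how the comparison could be automated. It assumes the dumps are jstack-style text already saved as strings, and it only looks at threads in state RUNNABLE (my own narrowing, since sleeping threads also sit at the same place in every dump). DumpDiff and its methods are hypothetical names.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: finds threads that are RUNNABLE with the same
// top stack frame in every supplied thread dump.
class DumpDiff {

    /** Maps thread name -> top Java frame for each RUNNABLE thread in one dump. */
    static Map<String, String> topFrames(String dump) {
        Map<String, String> frames = new LinkedHashMap<>();
        String name = null;
        boolean runnable = false;
        for (String raw : dump.split("\\R")) {
            String line = raw.trim();
            if (line.startsWith("\"")) {
                // Thread header: the name is the text between the first two quotes.
                int end = line.indexOf('"', 1);
                name = (end > 0) ? line.substring(1, end) : null;
                runnable = false;
            } else if (line.startsWith("java.lang.Thread.State:")) {
                runnable = line.contains("RUNNABLE");
            } else if (line.startsWith("at ") && name != null && runnable) {
                frames.putIfAbsent(name, line.substring(3)); // keep only the first frame
            }
        }
        return frames;
    }

    /** Threads whose top frame is identical in all dumps. */
    static Map<String, String> stuck(List<String> dumps) {
        Map<String, String> result = null;
        for (String dump : dumps) {
            Map<String, String> frames = topFrames(dump);
            if (result == null) {
                result = frames;
            } else {
                // Drop every thread that moved or disappeared in this dump.
                result.entrySet().removeIf(e -> !e.getValue().equals(frames.get(e.getKey())));
            }
        }
        return (result == null) ? new LinkedHashMap<>() : result;
    }
}
```

    A thread that shows up here is only a suspect: a tight loop and a thread blocked in a native call can both report RUNNABLE with an unchanged top frame.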

  • 2020-12-07 08:38

    Use ps -eL or top -H -p <pid> (or, to monitor in real time, run top and press Shift-H) to list the lightweight processes (LWPs, i.e. threads) associated with the Java process.

    root@xxx:/# ps -eL
    PID LWP TTY TIME CMD
     1 1 ? 00:00:00 java
     1 7 ? 00:00:01 java
     1 8 ? 00:07:52 java
     1 9 ? 00:07:52 java
     1 10 ? 00:07:51 java
     1 11 ? 00:07:52 java
     1 12 ? 00:07:52 java
     1 13 ? 00:07:51 java
     1 14 ? 00:07:51 java
     1 15 ? 00:07:53 java
    …
     1 164 ? 00:00:02 java
     1 166 ? 00:00:02 java
     1 169 ? 00:00:02 java
    

    Note: LWP stands for Lightweight Process. In Linux, each thread is backed by an LWP so that the kernel can manage it; an LWP shares files and other resources with its parent process. Now let us look at the threads that have accumulated the most CPU time:

     1 8 ? 00:07:52 java
     1 9 ? 00:07:52 java
     1 10 ? 00:07:51 java
     1 11 ? 00:07:52 java
     1 12 ? 00:07:52 java
     1 13 ? 00:07:51 java
     1 14 ? 00:07:51 java
     1 15 ? 00:07:53 java
    

    jstack is a JDK utility that prints the Java stack trace of every thread in a process.

    Familiarize yourself with the other useful JDK tools as well (jcmd, jstat, jhat, jmap, jstack, etc.; see https://docs.oracle.com/javase/8/docs/technotes/tools/unix/).

    jstack -l <process id>

    The nid (native thread ID) in the stack trace is the value that corresponds to the LWP in Linux (https://gist.github.com/rednaxelafx/843622).

    "GC task thread#0 (ParallelGC)" os_prio=0 tid=0x00007fc21801f000 nid=0x8 runnable
    

    The nid is given in hex, so we convert the IDs of the threads taking the most time: 8, 9, 10, 11, 12, 13, 14, 15 in decimal are 8, 9, a, b, c, d, e, f in hex.
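    For larger IDs the conversion is easier to do in code than in your head. A minimal sketch, assuming the nid is printed as lowercase hex with a 0x prefix as in the dumps shown here; NidConverter is just an illustrative name.

```java
// Hypothetical helper for converting between the decimal LWP id shown by
// ps/top and the hex nid shown in a jstack dump.
class NidConverter {

    /** Formats a decimal LWP id the way the dump prints nid, e.g. 10 becomes "0xa". */
    static String toNid(int lwp) {
        return "0x" + Integer.toHexString(lwp);
    }

    /** Parses an nid such as "0x710a" back into the decimal LWP id (assumes the "0x" prefix). */
    static int toLwp(String nid) {
        return Integer.parseInt(nid.substring(2), 16);
    }

    public static void main(String[] args) {
        // The eight busy LWPs from the ps -eL listing.
        for (int lwp = 8; lwp <= 15; lwp++) {
            System.out.println(lwp + " -> " + toNid(lwp));
        }
    }
}
```

    On a shell, printf '%x\n' <lwp> does the same job.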

    (Note that this particular stack was taken from Java in a Docker container, with a convenient process ID of 1.) Let us look at the threads with these IDs:

    "GC task thread#0 (ParallelGC)" os_prio=0 tid=0x00007fc21801f000 nid=0x8 runnable
    "GC task thread#1 (ParallelGC)" os_prio=0 tid=0x00007fc218020800 nid=0x9 runnable
    "GC task thread#2 (ParallelGC)" os_prio=0 tid=0x00007fc218022800 nid=0xa runnable
    "GC task thread#3 (ParallelGC)" os_prio=0 tid=0x00007fc218024000 nid=0xb runnable
    "GC task thread#4 (ParallelGC)" os_prio=0 tid=0x00007fc218026000 nid=0xc runnable
    "GC task thread#5 (ParallelGC)" os_prio=0 tid=0x00007fc218027800 nid=0xd runnable
    "GC task thread#6 (ParallelGC)" os_prio=0 tid=0x00007fc218029800 nid=0xe runnable
    "GC task thread#7 (ParallelGC)" os_prio=0 tid=0x00007fc21802b000 nid=0xf runnable
    

    These are all GC threads, so it is no wonder they were consuming a lot of CPU time. But is GC actually the problem here?

    Use the jstat (not jstack!) utility for a quick check of GC activity.

    jstat -gcutil <pid>
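    If you can add code to the application, a similar quick check is possible from inside the JVM. This is not what jstat does internally; it is an in-process alternative built on the standard GarbageCollectorMXBean API, and the GcOverhead class is a hypothetical name for this sketch.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical in-process check: how much of the JVM's uptime the
// collectors report as collection time.
class GcOverhead {

    /** Total milliseconds all collectors report having spent collecting so far. */
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 when the collector does not report it
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    /** Reported collection time divided by JVM uptime; 0.0 if uptime is not yet positive. */
    static double gcFraction() {
        long uptime = ManagementFactory.getRuntimeMXBean().getUptime();
        return (uptime <= 0) ? 0.0 : (double) totalGcMillis() / uptime;
    }

    public static void main(String[] args) {
        System.out.printf("GC time: %d ms (%.1f%% of uptime)%n",
                totalGcMillis(), gcFraction() * 100);
    }
}
```

    The reported time is approximate elapsed time, not CPU time summed over all GC threads, so treat the fraction as a rough indicator only.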

  • 2020-12-07 08:40

    Identifying which Java thread is consuming the most CPU on a production server

    Most (if not all) production systems doing anything important will use more than one Java thread. And when something goes crazy and your CPU usage is at 100%, it is hard to identify which thread or threads are causing it. Or so I thought, until someone smarter than me showed me how it can be done. Here I will show you how to do it, and you too can amaze your family and friends with your geek skills.

    A Test Application

    In order to test this, we need a test application. So I will give you one. It consists of 3 classes:

    • A HeavyThread class that does something CPU intensive (computing MD5 hashes)
    • A LightThread class that does something not-so-cpu-intensive (counting and sleeping).
    • A StartThreads class to start 1 cpu intensive and several light threads.

    Here is code for these classes:

    import java.security.MessageDigest;
    import java.security.NoSuchAlgorithmException;
    import java.util.UUID;
    
    /**
     * thread that does some heavy lifting
     *
     * @author srasul
     *
     */
    public class HeavyThread implements Runnable {
    
            private long length;
    
            public HeavyThread(long length) {
                    this.length = length;
                    new Thread(this).start();
            }
    
            @Override
            public void run() {
                    while (true) {
                            String data = "";
    
                            // make some stuff up
                            for (int i = 0; i < length; i++) {
                                    data += UUID.randomUUID().toString();
                            }
    
                            MessageDigest digest;
                            try {
                                    digest = MessageDigest.getInstance("MD5");
                            } catch (NoSuchAlgorithmException e) {
                                    throw new RuntimeException(e);
                            }
    
                            // hash the data
                            digest.update(data.getBytes());
                    }
            }
    }
    
    
    import java.util.Random;
    
    /**
     * thread that does little work. just count & sleep
     *
     * @author srasul
     *
     */
    public class LightThread implements Runnable {
    
            public LightThread() {
                    new Thread(this).start();
            }
    
            @Override
            public void run() {
                    long l = 0L;
                    while(true) {
                            l++;
                            try {
                                    Thread.sleep(new Random().nextInt(10));
                            } catch (InterruptedException e) {
                                    e.printStackTrace();
                            }
                            if(l == Long.MAX_VALUE) {
                                    l = 0L;
                            }
                    }
            }
    }
    
    
    /**
     * start it all
     *
     * @author srasul
     *
     */
    public class StartThreads {
    
            public static void main(String[] args) {
                    // lets start 1 heavy ...
                    new HeavyThread(1000);
    
                    // ... and 3 light threads
                    new LightThread();
                    new LightThread();
                    new LightThread();
            }
    }
    

    Assume that you have never seen this code, and all you have is the PID of a runaway Java process that is running these classes and consuming 100% CPU.

    First let's start the StartThreads class.

    $ ls
    HeavyThread.java  LightThread.java  StartThreads.java
    $ javac *
    $ java StartThreads &
    

    At this stage a Java process is running and should be taking up 100% CPU. In my top I see: [screenshot of top output]

    In top press Shift-H which turns on Threads. The man page for top says:

       -H : Threads toggle
            Starts top with the last remembered 'H' state reversed.  When
            this  toggle is On, all individual threads will be displayed.
            Otherwise, top displays a  summation  of  all  threads  in  a
            process.
    

    And now in my top with the Threads display turned on I see: [top screenshot with threads displayed]

    And I have a java process with PID 28924. Let's get the stack dump of this process using jstack:

    $ jstack 28924
    2010-11-18 13:05:41
    Full thread dump Java HotSpot(TM) 64-Bit Server VM (17.0-b16 mixed mode):
    
    "Attach Listener" daemon prio=10 tid=0x0000000040ecb000 nid=0x7150 waiting on condition [0x0000000000000000]
       java.lang.Thread.State: RUNNABLE
    
    "DestroyJavaVM" prio=10 tid=0x00007f9a98027800 nid=0x70fd waiting on condition [0x0000000000000000]
       java.lang.Thread.State: RUNNABLE
    
    "Thread-3" prio=10 tid=0x00007f9a98025800 nid=0x710d waiting on condition [0x00007f9a9d543000]
       java.lang.Thread.State: TIMED_WAITING (sleeping)
        at java.lang.Thread.sleep(Native Method)
        at LightThread.run(LightThread.java:21)
        at java.lang.Thread.run(Thread.java:619)
    
    "Thread-2" prio=10 tid=0x00007f9a98023800 nid=0x710c waiting on condition [0x00007f9a9d644000]
       java.lang.Thread.State: TIMED_WAITING (sleeping)
        at java.lang.Thread.sleep(Native Method)
        at LightThread.run(LightThread.java:21)
        at java.lang.Thread.run(Thread.java:619)
    
    "Thread-1" prio=10 tid=0x00007f9a98021800 nid=0x710b waiting on condition [0x00007f9a9d745000]
       java.lang.Thread.State: TIMED_WAITING (sleeping)
        at java.lang.Thread.sleep(Native Method)
        at LightThread.run(LightThread.java:21)
        at java.lang.Thread.run(Thread.java:619)
    
    "Thread-0" prio=10 tid=0x00007f9a98020000 nid=0x710a runnable [0x00007f9a9d846000]
       java.lang.Thread.State: RUNNABLE
        at sun.security.provider.DigestBase.engineReset(DigestBase.java:139)
        at sun.security.provider.DigestBase.engineUpdate(DigestBase.java:104)
        at java.security.MessageDigest$Delegate.engineUpdate(MessageDigest.java:538)
        at java.security.MessageDigest.update(MessageDigest.java:293)
        at sun.security.provider.SecureRandom.engineNextBytes(SecureRandom.java:197)
        - locked <0x00007f9aa457e400> (a sun.security.provider.SecureRandom)
        at sun.security.provider.NativePRNG$RandomIO.implNextBytes(NativePRNG.java:257)
        - locked <0x00007f9aa457e708> (a java.lang.Object)
        at sun.security.provider.NativePRNG$RandomIO.access$200(NativePRNG.java:108)
        at sun.security.provider.NativePRNG.engineNextBytes(NativePRNG.java:97)
        at java.security.SecureRandom.nextBytes(SecureRandom.java:433)
        - locked <0x00007f9aa4582fc8> (a java.security.SecureRandom)
        at java.util.UUID.randomUUID(UUID.java:162)
        at HeavyThread.run(HeavyThread.java:27)
        at java.lang.Thread.run(Thread.java:619)
    
    "Low Memory Detector" daemon prio=10 tid=0x00007f9a98006800 nid=0x7108 runnable [0x0000000000000000]
       java.lang.Thread.State: RUNNABLE
    
    "CompilerThread1" daemon prio=10 tid=0x00007f9a98004000 nid=0x7107 waiting on condition [0x0000000000000000]
       java.lang.Thread.State: RUNNABLE
    
    "CompilerThread0" daemon prio=10 tid=0x00007f9a98001000 nid=0x7106 waiting on condition [0x0000000000000000]
       java.lang.Thread.State: RUNNABLE
    
    "Signal Dispatcher" daemon prio=10 tid=0x0000000040de4000 nid=0x7105 runnable [0x0000000000000000]
       java.lang.Thread.State: RUNNABLE
    
    "Finalizer" daemon prio=10 tid=0x0000000040dc4800 nid=0x7104 in Object.wait() [0x00007f9a97ffe000]
       java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        - waiting on <0x00007f9aa45506b0> (a java.lang.ref.ReferenceQueue$Lock)
        at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:118)
        - locked <0x00007f9aa45506b0> (a java.lang.ref.ReferenceQueue$Lock)
        at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:134)
        at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:159)
    
    "Reference Handler" daemon prio=10 tid=0x0000000040dbd000 nid=0x7103 in Object.wait() [0x00007f9a9de92000]
       java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        - waiting on <0x00007f9aa4550318> (a java.lang.ref.Reference$Lock)
        at java.lang.Object.wait(Object.java:485)
        at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:116)
        - locked <0x00007f9aa4550318> (a java.lang.ref.Reference$Lock)
    
    "VM Thread" prio=10 tid=0x0000000040db8800 nid=0x7102 runnable 
    
    "GC task thread#0 (ParallelGC)" prio=10 tid=0x0000000040d6e800 nid=0x70fe runnable 
    
    "GC task thread#1 (ParallelGC)" prio=10 tid=0x0000000040d70800 nid=0x70ff runnable 
    
    "GC task thread#2 (ParallelGC)" prio=10 tid=0x0000000040d72000 nid=0x7100 runnable 
    
    "GC task thread#3 (ParallelGC)" prio=10 tid=0x0000000040d74000 nid=0x7101 runnable 
    
    "VM Periodic Task Thread" prio=10 tid=0x00007f9a98011800 nid=0x7109 waiting on condition 
    
    JNI global references: 910
    

    From my top I see that the PID of the top thread is 28938, and 28938 in hex is 0x710A. Notice that in the stack dump, each thread has an nid which is displayed in hex. And it just so happens that 0x710A is the nid of this thread:

    "Thread-0" prio=10 tid=0x00007f9a98020000 nid=0x710a runnable [0x00007f9a9d846000]
       java.lang.Thread.State: RUNNABLE
        at sun.security.provider.DigestBase.engineReset(DigestBase.java:139)
        at sun.security.provider.DigestBase.engineUpdate(DigestBase.java:104)
        at java.security.MessageDigest$Delegate.engineUpdate(MessageDigest.java:538)
        at java.security.MessageDigest.update(MessageDigest.java:293)
        at sun.security.provider.SecureRandom.engineNextBytes(SecureRandom.java:197)
        - locked <0x00007f9aa457e400> (a sun.security.provider.SecureRandom)
        at sun.security.provider.NativePRNG$RandomIO.implNextBytes(NativePRNG.java:257)
        - locked <0x00007f9aa457e708> (a java.lang.Object)
        at sun.security.provider.NativePRNG$RandomIO.access$200(NativePRNG.java:108)
        at sun.security.provider.NativePRNG.engineNextBytes(NativePRNG.java:97)
        at java.security.SecureRandom.nextBytes(SecureRandom.java:433)
        - locked <0x00007f9aa4582fc8> (a java.security.SecureRandom)
        at java.util.UUID.randomUUID(UUID.java:162)
        at HeavyThread.run(HeavyThread.java:27)
        at java.lang.Thread.run(Thread.java:619)
    

    And so you can confirm that the thread which is running the HeavyThread class is consuming most CPU.

    In real-world situations, it will probably be a bunch of threads that each consume some portion of the CPU, and these threads put together will lead to the Java process using 100% CPU.

    Summary

    • Run top
    • Press Shift-H to enable Threads View
    • Get PID of the thread with highest CPU
    • Convert PID to HEX
    • Get stack dump of java process
    • Look for thread with the matching HEX PID.
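
    The last two steps of this summary (convert to hex, then search the dump) can be sketched in a few lines of Java. The sketch assumes you have the jstack output as a string and that nid is printed in hex as in the dump above; JstackLookup is a made-up name, not part of the JDK.

```java
// Hypothetical helper: given jstack output and the decimal thread PID from
// top, returns the stack block of the thread with the matching hex nid.
class JstackLookup {

    /** Returns the matching thread's header and frames, or null if no nid matches. */
    static String findByLwp(String jstackOutput, int lwp) {
        // Surrounding spaces make sure 0x71 does not match 0x710a.
        String needle = " nid=0x" + Integer.toHexString(lwp) + " ";
        StringBuilder block = null;
        for (String line : jstackOutput.split("\\R")) {
            String t = line.trim();
            if (block == null) {
                // Thread headers start with the quoted thread name.
                if (t.startsWith("\"") && (t + " ").contains(needle)) {
                    block = new StringBuilder(t);
                }
            } else if (t.isEmpty() || t.startsWith("\"")) {
                break; // blank line or next header: the block is complete
            } else {
                block.append('\n').append(line);
            }
        }
        return (block == null) ? null : block.toString();
    }
}
```

    With the dump above, findByLwp(dump, 28938) would return the "Thread-0" block, the one running HeavyThread.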
  • 2020-12-07 08:46

    This is a kind of hacky way, but it seems like you could fire the application up in a debugger, and then suspend all the threads, and go through the code and find out which one isn't blocking on a lock or an I/O call in some kind of loop. Or is this like what you've already tried?
