Community
Participate
Working Groups
[this issue was originally opened by Simone as https://jira.codehaus.org/browse/JETTY-1396] We have had in the past reports for threads going 100% CPU and hogging the server, but never able to track down exactly what was causing that. This issue is to write a tool that tracks down 100% CPU cases and gives us as much information as possible about them. The idea is to poll the current threads via ThreadMXBean.getThreadCpuTime(long) (and/or related methods) to see how much a thread is busy. This needs to be done using System.nanoTime(), and not with System.currentTimeMillis(). Along with that calculation, which will tell us what thread is spinning, it will be useful to get few stack traces of that thread during the spinning period (and be sure they're taken during that spinning period) via ThreadMXBean.getThreadInfo(long).getStackTrace(). This should give more insights of where the thread is spinning. Stack traces taken at different times can be compared to see if they relate, for example if they have the same stack frames up to a certain point, then it may mean the thread is spinning inside a method that is probably the last frame in common among the stack traces.
This has been committed for Jetty v7.5.0. Simone, I'm looking forward to your comments when you get back.
Michael, I reopened this bug to suggest some feature that may make the tool more advanced. I would like a feature to print, ordered by CPU time spent, the threads every N minutes where N is configurable (or perhaps N seconds). Also, would it be possible to have a "profiler" mode that, when enabled, will analyze the top X threads like a profiler would (e.g. take a stack trace every M millis - by sampling, and derive a call tree statistics based on class and method names found in the stack traces. For example: Initial observation shows 3 threads consume most CPU. Start profile mode. Sample1.thread1 = Thread.run() + Class1.m1() + Class2.m2() Sample1.thread2 = Thread.run() + Class2.m2() Sample1.thread3 = Thread.run() + Class2.m3() So the call tree shows: Thread.run() - 100% Class1.m1() - 33% Class2.m2() - 33% Class2.m2() - 33% Class2.m3() - 33% However, please try to see if the profile mode is too heavy (you have to finish computation before the next sample). It's a simplified version of a real profiler, but has the benefit of sampling only few threads (and not all of them) and avoids bytecode manipulation that normally real profilers do which may slow down the application a lot.
Simone, I've implemented dumping of the thread CPU utilization as you requested. In regard to "profiler" mode, I am not sure it is valid to assert that if the stack traces in consecutive samples are the same, then the entire time between the two samples were taken was spent inside the same method. We will need to discuss this some more tomorrow. -Michael
Resolving this ticket
Resolved -> Closed