Some Eclipse Foundation services are deprecated, or will be soon. Please ensure you've read this important communication.
Bug 352684 - Implement spinning thread analyzer
Summary: Implement spinning thread analyzer
Status: CLOSED FIXED
Alias: None
Product: Jetty
Classification: RT
Component: server (show other bugs)
Version: 7.5.0   Edit
Hardware: All All
: P3 enhancement (vote)
Target Milestone: 7.5.x   Edit
Assignee: Michael Gorovoy CLA
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2011-07-20 19:02 EDT by Michael Gorovoy CLA
Modified: 2011-09-20 15:52 EDT (History)
2 users (show)

See Also:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Michael Gorovoy CLA 2011-07-20 19:02:06 EDT
[this issue was originally opened by Simone as https://jira.codehaus.org/browse/JETTY-1396]

We have had in the past reports for threads going 100% CPU and hogging the server, but never able to track down exactly what was causing that. This issue is to write a tool that tracks down 100% CPU cases and gives us as much information as possible about them.

The idea is to poll the current threads via ThreadMXBean.getThreadCpuTime(long) (and/or related methods) to see how much a thread is busy. This needs to be done using System.nanoTime(), and not with System.currentTimeMillis().

Along with that calculation, which will tell us what thread is spinning, it will be useful to get few stack traces of that thread during the spinning period (and be sure they're taken during that spinning period) via ThreadMXBean.getThreadInfo(long).getStackTrace(). This should give more insights of where the thread is spinning. Stack traces taken at different times can be compared to see if they relate, for example if they have the same stack frames up to a certain point, then it may mean the thread is spinning inside a method that is probably the last frame in common among the stack traces.
Comment 1 Michael Gorovoy CLA 2011-07-21 14:36:15 EDT
This has been committed for Jetty v7.5.0.

Simone, I'm looking forward to your comments when you get back.
Comment 2 Simone Bordet CLA 2011-08-02 15:50:52 EDT
Michael, I reopened this bug to suggest some feature that may make the tool more advanced.

I would like a feature to print, ordered by CPU time spent, the threads every N minutes where N is configurable (or perhaps N seconds).

Also, would it be possible to have a "profiler" mode that, when enabled, will analyze the top X threads like a profiler would (e.g. take a stack trace every M millis - by sampling, and derive a call tree statistics based on class and method names found in the stack traces.

For example:

Initial observation shows 3 threads consume most CPU. Start profile mode.
Sample1.thread1 = Thread.run() + Class1.m1() + Class2.m2()
Sample1.thread2 = Thread.run() + Class2.m2()
Sample1.thread3 = Thread.run() + Class2.m3()

So the call tree shows:

Thread.run() - 100%
  Class1.m1() - 33%
    Class2.m2() - 33%
  Class2.m2() - 33%
  Class2.m3() - 33%

However, please try to see if the profile mode is too heavy (you have to finish computation before the next sample).
It's a simplified version of a real profiler, but has the benefit of sampling only few threads (and not all of them) and avoids bytecode manipulation that normally real profilers do which may slow down the application a lot.
Comment 3 Michael Gorovoy CLA 2011-08-02 21:11:30 EDT
Simone,

I've implemented dumping of the thread CPU utilization as you requested.

In regard to "profiler" mode, I am not sure it is valid to assert that if the stack traces in consecutive samples are the same, then the entire time between the two samples were taken was spent inside the same method. We will need to discuss this some more tomorrow.

-Michael
Comment 4 Michael Gorovoy CLA 2011-08-30 13:36:36 EDT
Resolving this ticket
Comment 5 Jesse McConnell CLA 2011-09-20 15:52:09 EDT
Resolved -> Closed