| Summary: | Eclipse hangs on full rebuild | | |
|---|---|---|---|
| Product: | [Eclipse Project] Platform | Reporter: | Andre Weinand <andre_weinand> |
| Component: | Resources | Assignee: | John Arthorne <john.arthorne> |
| Status: | RESOLVED FIXED | QA Contact: | |
| Severity: | critical | | |
| Priority: | P3 | CC: | douglas.pollock, Michael.Valenta, Tod_Creasey |
| Version: | 3.0 | | |
| Target Milestone: | 3.0 M8 | | |
| Hardware: | All | | |
| OS: | All | | |
| Whiteboard: | | | |
| Attachments: | | | |

Description
Andre Weinand
Created attachment 8775 [details]
full thread dump
I forgot to mention: After the full rebuild had started in background, I changed a Java file and saved it.

Anything in your log file around that time?

No, nothing.

The UI thread is trying to modify the workspace, and thus must wait for the autobuild to complete. The autobuild has finished, and it is in the middle of trying to do a syncExec. The UI thread should be responding to the syncExecs since we check for this while waiting on the join. My only possible hunch is that you are seeing something like bug 55637, where it is not hung, but there is tons of work happening in syncExecs and the UI thread is processing them all while waiting for the build to complete.

Michael Valenta and I have tracked down a deadlock with a very similar stack trace during execution of the Team/CVS automated tests. In both cases it looks on the surface like a classic deadlock:
- UI thread is waiting for a lock (in this case a join)
- Thread owning the lock is trying to syncExec

The UILockListener/UISynchronizer hooks are designed to solve this deadlock, but they didn't handle the case where there was a nested wait inside a syncExec:
1) UI attempts to acquire a lock A
2) In UILockListener.aboutToWait, the "ui" field is set to indicate that the UI thread is waiting on a lock.
3) Thread owning lock A does syncExec 1
4) UI thread grants syncExec because it notices that it is waiting for a lock (ui field is non-null)
5) syncExec 1 attempts to acquire a different lock B
6) Lock B is granted in UI thread
8) When lock B is released, UILockListener.aboutToRelease clears the field UILockListener.ui, which records the fact that the UI thread is waiting for a lock.
9) Thread owning lock does another syncExec
10) UI thread DOES NOT service the syncExec because it believes it is not currently waiting on a lock (UILockListener.isUIWaiting() returns false because the ui field is null).

-> The system deadlocks because the UI thread has "forgotten" that it was waiting after releasing the nested lock, so it stops servicing syncExecs from the lock-owning thread.

Created attachment 8833 [details]
Proposed fix to UILockListener
Proposed fix. Fix is to re-assign the UILockListener.ui field after servicing
pending work (sync execs), in case the nested execution acquired another lock
and cleared the field in the process. This guarantees that the UI thread knows
it is waiting before it falls asleep on the lock it is waiting for.
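
The sequence of steps and the proposed fix can be illustrated with a small single-threaded model. This is only a sketch, not the attached patch or the real Eclipse code: the class `UiWaitModel`, its `simulate` method, the `reassignAfterWork` flag, and the simplified method signatures are assumptions made for illustration. Only the `ui` field and the names `aboutToWait`, `aboutToRelease`, and `isUIWaiting` are taken from the discussion.

```java
import java.util.ArrayDeque;
import java.util.Queue;

/** Toy model of the "ui" flag described in this bug; hypothetical, not the real UILockListener. */
class UiWaitModel {
    private Thread ui; // non-null while the UI thread is believed to be waiting on a lock
    private final Queue<Runnable> pending = new ArrayDeque<>(); // queued syncExecs
    private final boolean reassignAfterWork; // true = apply the idea of the proposed fix

    UiWaitModel(boolean reassignAfterWork) {
        this.reassignAfterWork = reassignAfterWork;
    }

    boolean isUIWaiting() {
        return ui != null;
    }

    /** The lock-owning thread asks the UI thread to run some work. */
    void syncExec(Runnable work) {
        pending.add(work);
    }

    /** Called when the UI thread is about to block on a lock (steps 1, 2 and 4). */
    void aboutToWait() {
        ui = Thread.currentThread();
        Runnable work;
        while ((work = pending.poll()) != null) {
            work.run(); // nested code may acquire and release another lock
        }
        if (reassignAfterWork) {
            // Proposed fix: restore the flag in case nested work cleared it.
            ui = Thread.currentThread();
        }
    }

    /** Called when a lock held by the UI thread is released (step 8). */
    void aboutToRelease() {
        ui = null;
    }

    /** Replays the scenario; returns whether a second syncExec (step 9) would be serviced. */
    static boolean simulate(boolean withFix) {
        UiWaitModel model = new UiWaitModel(withFix);
        // Step 3: syncExec 1 takes lock B, which is granted at once, then releases it.
        model.syncExec(() -> model.aboutToRelease());
        model.aboutToWait();
        return model.isUIWaiting();
    }

    public static void main(String[] args) {
        System.out.println(simulate(false)); // prints "false": the UI thread forgot it was waiting
        System.out.println(simulate(true));  // prints "true": the flag is restored before sleeping
    }
}
```

In the model without the fix, the flag is null after the nested release, which corresponds to step 10: the next syncExec would never be serviced. With the re-assignment, the flag is set again before the UI thread goes to sleep on lock A.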
I have tried the above fix with the CVS test cases that previously locked every time and they ran to completion with no failures.

John: Tod just noticed that we've been tracking a similar bug. The bug number is Bug 55714. We have a lot less specific information than is provided here. However, if it is the same bug, we've been seeing it a *lot*.

I don't think so. I think most of the swirl in that bug is due to bug 55605. The only stack trace that was attached looked completely different (from Chris, an SWT hang).

A fix has been released for the next 3.0 M8 integration build (thanks Doug). I will mark fixed, but please reopen if experienced again on I20040325 or greater.