Community
Participate
Working Groups
I just tried using Kepler for the first time and immediately run into a deadlock :-(. The jstack utility shows a deadlock with 3 threads all threads blocked in the JobManager class. This is with build R-4.3-201306052000
Created attachment 233001 [details] stack traces of deadlock as reported by jstack (win32) Search for "Found one Java-level deadlock" at the end
The problem seems to be in Worker-3. There is a call to com.accurev.ide.eclipse.ResourcesManager$3.runInWorkspace(ResourcesManager.java:416) which tries to join a job. By comparing hash codes you can see that 0x1314f5b8 belongs to AutoBuildJob (see Worker-4). AutoBuildJob is an internal job and it is not exposed anywhere, so the question is: how is it possible that com.accurev.ide.eclipse.ResourcesManager$3.runInWorkspace has a reference to AutoBuildJob to be able to call AutoBuildJob.join()? The only technical way to do that is that com.accurev.ide.eclipse.ResourcesManager$3.runInWorkspace could be using Eclipse internal classes to get access to AutoBuildJob and that is not supported. Please file a bug against vendor of com.accurev.ide.eclipse.* to investigate this issue.
(In reply to comment #2) > The problem seems to be in Worker-3. There is a call to > com.accurev.ide.eclipse.ResourcesManager$3.runInWorkspace(ResourcesManager. > java:416) which tries to join a job. By comparing hash codes you can see > that 0x1314f5b8 belongs to AutoBuildJob (see Worker-4). AutoBuildJob is an > internal job and it is not exposed anywhere, Can't they get it from the IJobManager (depending on when they call the methods, it could be returned from there or passed in an IJobChangeListener) PW
(In reply to comment #3) > Can't they get it from the IJobManager (depending on when they call the > methods, it could be returned from there or passed in an IJobChangeListener) > > PW You're right. I will prepare a test case for that.
Created attachment 233526 [details] Patch Job is an API class so it can be used as a synchronization object by clients. We shouldn't perform any locking directly on a job since we can get deadlocks like the one in comment 1. The patch solves the deadlock by adding a helper object for synchronization purposes. Test for this bug is also added. There is also another place where we synchronize on a job internally: ThreadJob.run(IProgressMonitor) and ThreadJob.isRunning(). This is an internal job implementation not exposed to clients so I haven't changed that to use the new object, but it may be worth to consider changing it as well.
Thanks for the patch Szymon. For future patches, we need you to "sign off" on the commit to indicate accepting the Eclipse CoO [1]. This can be done either of the following ways: - add your sign-off in a commit, and push the commit to gerrit for review - Attach patch in the git format-patch format with the required header. For example if you create the change in a new commit in master you can generate a patch with required headers as follows: git format-patch HEAD~1 --stdout -s
(In reply to comment #6) > Thanks for the patch Szymon. For future patches, we need you to "sign off" > on the commit to indicate accepting the Eclipse CoO [1]. This can be done > either of the following ways: > > - add your sign-off in a commit, and push the commit to gerrit for review > - Attach patch in the git format-patch format with the required header. For > example if you create the change in a new commit in master you can generate > a patch with required headers as follows: > > git format-patch HEAD~1 --stdout -s Ok, will do.
I took a slightly different approach for a fix. The only reason for the synchronization there is to lazily initialize the listener list object. The reason for that is to avoid the extra overhead of an object per job if it is not needed. Adding a lock object defeats the purpose of that. So I have removed the synchronization, and eagerly initialized the listener list instead: http://git.eclipse.org/c/platform/eclipse.platform.runtime.git/commit/?id=3a89bc4229d1c798e8e72d58624be452dc177f28
Changing assignee to Szymon because he did most of the work here.