Some Eclipse Foundation services are deprecated, or will be soon. Please ensure you've read this important communication.

Bug 129551

Summary: [jobs] Test failures in refactoring tests: "Error changing from state: 16"
Product: [Eclipse Project] Platform Reporter: Markus Keller <markus.kell.r>
Component: RuntimeAssignee: John Arthorne <john.arthorne>
Status: RESOLVED FIXED QA Contact:
Severity: major    
Priority: P2 CC: andre_weinand, benno.baumgartner, daniel_megert, john.arthorne
Version: 3.2   
Target Milestone: 3.2 M6   
Hardware: All   
OS: All   
Whiteboard:

Description Markus Keller CLA 2006-02-27 06:32:34 EST
N20060226-0010

We had test failures in jdt.ui.tests.refactoring and jdt.ui.tests on MacOS X:
http://download.eclipse.org/eclipse/downloads/drops/N20060226-0010/testresults/html/org.eclipse.jdt.ui.tests.refactoring_macosx.carbon.ppc.html

I have no idea why this happened. The code in JobManager that threw the exception:

try {
	sleeping.enqueue(job);
} catch (RuntimeException e) {
	throw new RuntimeException("Error changing from state: " + oldState);
}


Error changing from state: 16

java.lang.RuntimeException: Error changing from state: 16
at org.eclipse.core.internal.jobs.JobManager.changeState(JobManager.java:310)
at org.eclipse.core.internal.jobs.JobManager.sleep(JobManager.java:1058)
at org.eclipse.core.internal.jobs.InternalJob.sleep(InternalJob.java:497)
at org.eclipse.core.runtime.jobs.Job.sleep(Job.java:624)
at org.eclipse.core.internal.events.AutoBuildJob.interrupt(AutoBuildJob.java:116)
at org.eclipse.core.internal.events.BuildManager.interrupt(BuildManager.java:761)
at org.eclipse.core.internal.resources.Workspace.prepareOperation(Workspace.java:1681)
at org.eclipse.core.internal.resources.Resource.refreshLocal(Resource.java:1396)
at org.eclipse.jdt.ui.tests.refactoring.RefactoringTest.refreshFromLocal(RefactoringTest.java:130)
at org.eclipse.jdt.ui.tests.refactoring.RefactoringTest.tearDown(RefactoringTest.java:102)
at junit.extensions.TestDecorator.basicRun(TestDecorator.java:22)
at junit.extensions.TestSetup$1.protect(TestSetup.java:19)
at junit.extensions.TestSetup.run(TestSetup.java:23)
at junit.extensions.TestDecorator.basicRun(TestDecorator.java:22)
at junit.extensions.TestSetup$1.protect(TestSetup.java:19)
at junit.extensions.TestSetup.run(TestSetup.java:23)
at org.eclipse.test.EclipseTestRunner.run(EclipseTestRunner.java:330)
at org.eclipse.test.EclipseTestRunner.run(EclipseTestRunner.java:204)
at org.eclipse.test.UITestApplication$3.run(UITestApplication.java:188)
at org.eclipse.swt.widgets.RunnableLock.run(RunnableLock.java:35)
at org.eclipse.swt.widgets.Synchronizer.runAsyncMessages(Synchronizer.java:123)
at org.eclipse.swt.widgets.Display.runAsyncMessages(Display.java:3128)
at org.eclipse.swt.widgets.Display.readAndDispatch(Display.java:2892)
at org.eclipse.ui.internal.Workbench.runEventLoop(Workbench.java:1927)
at org.eclipse.ui.internal.Workbench.runUI(Workbench.java:1891)
at org.eclipse.ui.internal.Workbench.createAndRunWorkbench(Workbench.java:423)
at org.eclipse.ui.PlatformUI.createAndRunWorkbench(PlatformUI.java:143)
at org.eclipse.ui.internal.ide.IDEApplication.run(IDEApplication.java:107)
at org.eclipse.test.UITestApplication.runApplication(UITestApplication.java:131)
at org.eclipse.test.UITestApplication.run(UITestApplication.java:58)
at org.eclipse.core.internal.runtime.PlatformActivator$1.run(PlatformActivator.java:99)
at org.eclipse.core.runtime.internal.adaptor.EclipseAppLauncher.runApplication(EclipseAppLauncher.java:92)
at org.eclipse.core.runtime.internal.adaptor.EclipseAppLauncher.start(EclipseAppLauncher.java:68)
at org.eclipse.core.runtime.adaptor.EclipseStarter.run(EclipseStarter.java:374)
at org.eclipse.core.runtime.adaptor.EclipseStarter.run(EclipseStarter.java:169)
at org.eclipse.core.launcher.Main.invokeFramework(Main.java:338)
at org.eclipse.core.launcher.Main.basicRun(Main.java:282)
at org.eclipse.core.launcher.Main.run(Main.java:977)
at org.eclipse.core.launcher.Main.main(Main.java:952)
Comment 1 Benno Baumgartner CLA 2006-03-03 05:59:09 EST
This happens a lot on my machine, and it's an XP SP2 running on Intel P4...

JRE: j9n142-20050609
no VM args
Comment 2 John Arthorne CLA 2006-03-03 15:37:22 EST
Interesting ... there haven't been more than cosmetic changes in the jobs implementation in 3.2, so something new in the tests or the code being tested must be inducing this.  I won't be able to investigate until Monday.
Comment 3 John Arthorne CLA 2006-03-06 16:27:57 EST
Benno, can you give me hints on how you easily reproduce this? Is there a particular test that you run that can reproduce?  I have run AllRefactoringTests a couple of times without getting the failure, but it takes so long to run that it's very hard to track it down.

I think this is a bug that I've been tracking for months but have never been able to reproduce myself. I added the extra error checking to help track it down, but if I could reproduce it locally I'm sure I could make better progress.
Comment 4 Benno Baumgartner CLA 2006-03-07 03:57:57 EST
With:
Version: 3.2.0
Build id: I20060301-0800

And JRE (other vms seam to work fine):
C:\java\j9n142-20050609\bin>java -version
java version "1.4.2"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.4.2)
Classic VM (build 1.4.2, J2RE 1.4.2 IBM Windows 32 build cn142-20050609 (JIT ena
bled: jitc))

And running:
org.eclipse.jdt.ui.tests.core.CallHierarchyTest

I get at least one of the above exception in about 80% of the runs.

Hope that helps...
Comment 5 John Arthorne CLA 2006-03-07 11:19:41 EST
Thanks Benno. I found that VM, but still no luck reproducing.  Two more questions:

 - Are you using -Xj9 VM argument? If not, does it still fail with that arg?
 - Is your machine a dual processor?
Comment 6 Benno Baumgartner CLA 2006-03-07 12:25:25 EST
>Are you using -Xj9 VM argument? If not, does it still fail with that arg?

I don't use the argument. But it also fails with the arg.

>Is your machine a dual processor?

Yes, it is a dual processor machine.
Comment 7 John Arthorne CLA 2006-03-08 11:16:28 EST
I have created a JUnit test that reproduces the problem:

org.eclipse.core.tests.runtime.jobs.Bug_129551

Here is what happens in this test:

1) Job1 and Job2 are created and given the same scheduling rule
2) Job1 is scheduled, followed by Job2
3) Job1 is pulled from the wait queue by a worker thread and moved to the ABOUT_TO_RUN state.
4) Another worker thread dequeues Job2, notices that it conflicts with Job1, and adds it to Job1's queue of blocked jobs.
5) Job1 is put to sleep before it starts to run.  The sleep queue uses the same next and previous fields used to maintain the list of blocked jobs, so it causes an assertion error.

It is a very small timing hole in the time between the job enters the ABOUT_TO_RUN state and when it actually runs.  If the job is put to sleep in this small period of time, and another job is blocked on it, it will cause the failure.  It seems on a multi-processor machine with a particular VM, this timing hole is big enough for the failure to happen.

I have sent Benno a patch to verify my fix.
Comment 8 Benno Baumgartner CLA 2006-03-08 12:01:36 EST
Looks like you fixed it! I can't reproduce it any longer with patch you did send me. Even after 10 runs I did not get any exception.
Comment 9 John Arthorne CLA 2006-03-08 12:18:46 EST
Excellent!  Thank you very much Benno for your help in tracking this down and verifying the fix.  I have been trying to figure this one out for several months (the bug likely existed since 3.0).  The fix and automated test have been released.