Community
Participate
Working Groups
Test failure in N20050918-0010, Mac OSX: testJobFamilyFind Failure Timeout waiting for job to start junit.framework.AssertionFailedError: Timeout waiting for job to start at org.eclipse.core.tests.runtime.jobs.IJobManagerTest.waitForStart(IJobManagerTest.java:1644) at org.eclipse.core.tests.runtime.jobs.IJobManagerTest.testJobFamilyFind(IJobManagerTest.java:458) at org.eclipse.test.EclipseTestRunner.run(EclipseTestRunner.java:330) at org.eclipse.test.EclipseTestRunner.run(EclipseTestRunner.java:204) at org.eclipse.test.CoreTestApplication.runTests(CoreTestApplication.java:35) at org.eclipse.test.CoreTestApplication.run(CoreTestApplication.java:31) at org.eclipse.core.internal.runtime.PlatformActivator$1.run(PlatformActivator.java:226) at org.eclipse.core.runtime.adaptor.EclipseStarter.run(EclipseStarter.java:377) at org.eclipse.core.runtime.adaptor.EclipseStarter.run(EclipseStarter.java:164) at org.eclipse.core.launcher.Main.invokeFramework(Main.java:338) at org.eclipse.core.launcher.Main.basicRun(Main.java:282) at org.eclipse.core.launcher.Main.run(Main.java:977) at org.eclipse.core.launcher.Main.main(Main.java:952)
Test failure in N20051007 MacOSX: Timeout waiting for job to start junit.framework.AssertionFailedError: Timeout waiting for job to start at org.eclipse.core.tests.runtime.jobs.IJobManagerTest.waitForStart(IJobManagerTest.java:1644) at org.eclipse.core.tests.runtime.jobs.IJobManagerTest.testJobFamilyCancel(IJobManagerTest.java:326) at org.eclipse.test.EclipseTestRunner.run(EclipseTestRunner.java:330) at org.eclipse.test.EclipseTestRunner.run(EclipseTestRunner.java:204) at org.eclipse.test.CoreTestApplication.runTests(CoreTestApplication.java:35) at org.eclipse.test.CoreTestApplication.run(CoreTestApplication.java:31) at org.eclipse.core.internal.runtime.PlatformActivator$1.run(PlatformActivator.java:226) at org.eclipse.core.runtime.adaptor.EclipseStarter.run(EclipseStarter.java:377) at org.eclipse.core.runtime.adaptor.EclipseStarter.run(EclipseStarter.java:164) at org.eclipse.core.launcher.Main.invokeFramework(Main.java:338) at org.eclipse.core.launcher.Main.basicRun(Main.java:282) at org.eclipse.core.launcher.Main.run(Main.java:977) at org.eclipse.core.launcher.Main.main(Main.java:952)
Another Mac failure on I20051101-0010 (note this is an error, not a test assertion failure: testJobFamilyNULL Error assertion failed: org.eclipse.core.internal.runtime.AssertionFailedException: assertion failed: at org.eclipse.core.internal.runtime.Assert.isTrue(Assert.java:107) at org.eclipse.core.internal.runtime.Assert.isTrue(Assert.java:93) at org.eclipse.core.internal.jobs.JobQueue.enqueue(JobQueue.java:73) at org.eclipse.core.internal.jobs.JobManager.changeState(JobManager.java:279) at org.eclipse.core.internal.jobs.JobManager.sleep(JobManager.java:978) at org.eclipse.core.internal.jobs.JobManager.sleep(JobManager.java:990) at org.eclipse.core.tests.runtime.jobs.IJobManagerTest.testJobFamilyNULL(IJobManagerTest.java:907) at org.eclipse.test.EclipseTestRunner.run(EclipseTestRunner.java:330) at org.eclipse.test.EclipseTestRunner.run(EclipseTestRunner.java:204) at org.eclipse.test.CoreTestApplication.runTests(CoreTestApplication.java:35) at org.eclipse.test.CoreTestApplication.run(CoreTestApplication.java:31) at org.eclipse.core.internal.runtime.PlatformActivator$1.run(PlatformActivator.java:226) at org.eclipse.core.runtime.adaptor.EclipseStarter.run(EclipseStarter.java:376) at org.eclipse.core.runtime.adaptor.EclipseStarter.run(EclipseStarter.java:165) at org.eclipse.core.launcher.Main.invokeFramework(Main.java:338) at org.eclipse.core.launcher.Main.basicRun(Main.java:282) at org.eclipse.core.launcher.Main.run(Main.java:977) at org.eclipse.core.launcher.Main.main(Main.java:952)
I have found a place in JobManager that could be responsible for this failure. In JobManager.endJob, the completed job is detached from the linked list of blocked jobs by calling job.setPrevious(null). However, since the list is doubly linked, the first blocked job to be removed will relink the finished job to the next blocked job in the list. Thus the completed job ends up having a dangling "previous" pointer. I have released a fix for this and we will see if it fixes this test failure.
Note to self: This hasn't been released yet because of the runtime refactoring.
Failed again in N20051209: testJobFamilyCancel Failure Timeout waiting for job to start. Job: TestSecondFamily(40), state: 2 Console output: [java] !ENTRY org.eclipse.core.jobs 4 1 2005-12-09 02:18:14.484 [java] !MESSAGE Worker thread ended job: testBeginRuleNoEnd(24), but still holds rule: ThreadJob(testBeginRuleNoEnd(24),[PathRule(testBeginRuleNoEnd),]) [java] **** BEGIN DUMP JOB MANAGER INFORMATION **** [java] TestSecondFamily(42) state: RUNNING [java] TestSecondFamily(44) state: WAITING [java] TestSecondFamily(46) state: WAITING [java] TestSecondFamily(48) state: WAITING [java] TestSecondFamily(50) state: WAITING [java] TestSecondFamily(52) state: WAITING [java] TestSecondFamily(54) state: WAITING [java] TestSecondFamily(56) state: WAITING [java] TestSecondFamily(58) state: WAITING [java] Compacting memory(10) state: SLEEPING [java] **** END DUMP JOB MANAGER INFORMATION ****
Failed in I20060207 on Linux. I have disabled the test on non-windows platforms for now because after significant investigation I was unable to find the cause of the failure. Running the test independently, it fails roughly once in a hundred executions. On the test machine, it seems to happen more frequently. testJobFamilyFind Failure Timeout waiting for job to start. Job: TestThirdFamily(122), state: 2 junit.framework.AssertionFailedError: Timeout waiting for job to start. Job: TestThirdFamily(122), state: 2 at org.eclipse.core.tests.runtime.jobs.IJobManagerTest.waitForStart(IJobManagerTest.java:1657) at org.eclipse.core.tests.runtime.jobs.IJobManagerTest.testJobFamilyFind(IJobManagerTest.java:457) at org.eclipse.test.EclipseTestRunner.run(EclipseTestRunner.java:330) at org.eclipse.test.EclipseTestRunner.run(EclipseTestRunner.java:204) at org.eclipse.test.CoreTestApplication.runTests(CoreTestApplication.java:35) at org.eclipse.test.CoreTestApplication.run(CoreTestApplication.java:31) at org.eclipse.core.internal.runtime.PlatformActivator$1.run(PlatformActivator.java:109) at org.eclipse.core.runtime.internal.adaptor.EclipseAppLauncher.runApplication(EclipseAppLauncher.java:92) at org.eclipse.core.runtime.internal.adaptor.EclipseAppLauncher.start(EclipseAppLauncher.java:68) at org.eclipse.core.runtime.adaptor.EclipseStarter.run(EclipseStarter.java:374) at org.eclipse.core.runtime.adaptor.EclipseStarter.run(EclipseStarter.java:169) at org.eclipse.core.launcher.Main.invokeFramework(Main.java:338) at org.eclipse.core.launcher.Main.basicRun(Main.java:282) at org.eclipse.core.launcher.Main.run(Main.java:977) at org.eclipse.core.launcher.Main.main(Main.java:952)
This is possibly a duplicate of bug 129551. I will reenable the tests in the nightly builds to confirm.
I wonder what is the status of this bug? There is still trace of this bug # in the IJobManagerTest and some of the tests are still executed only on Windows. One such test has failed in the exact same way as described.
Danail, what do you mean by "One such test has failed..." Failed on Windows, or on other platforms? What build? There have been some bug fixes in jobs in 3.3 that might have solved this problem. When I have some time I may re-enable these tests again in the nightly build and see how it goes.
The test failed on Windows machine. The code under test is 3.2.2. However, that doesn't mean anything, so we can ignore this:) I wondered whether these bugs are fixed in 3.3. I expected to see that they are reenabled, but they weren't. I will keep you informed with any progress I make.
I20071031-0800, win32: testJobFamilyCancel Failure Timeout waiting for job to start. Job: TestSecondFamily(43), state: 2 junit.framework.AssertionFailedError: Timeout waiting for job to start. Job: TestSecondFamily(43), state: 2 at org.eclipse.core.tests.runtime.jobs.IJobManagerTest.waitForRunCount(IJobManagerTest.java:1682) at org.eclipse.core.tests.runtime.jobs.IJobManagerTest.waitForStart(IJobManagerTest.java:1692) at org.eclipse.core.tests.runtime.jobs.IJobManagerTest.testJobFamilyCancel(IJobManagerTest.java:326) at org.eclipse.test.EclipseTestRunner.run(EclipseTestRunner.java:354) at org.eclipse.test.EclipseTestRunner.run(EclipseTestRunner.java:206) at org.eclipse.test.CoreTestApplication.runTests(CoreTestApplication.java:35) at org.eclipse.test.CoreTestApplication.run(CoreTestApplication.java:31) at org.eclipse.equinox.internal.app.EclipseAppContainer.callMethod(EclipseAppContainer.java:570) at org.eclipse.equinox.internal.app.EclipseAppHandle.run(EclipseAppHandle.java:195) at org.eclipse.core.runtime.internal.adaptor.EclipseAppLauncher.runApplication(EclipseAppLauncher.java:106) at org.eclipse.core.runtime.internal.adaptor.EclipseAppLauncher.start(EclipseAppLauncher.java:76) at org.eclipse.core.runtime.adaptor.EclipseStarter.run(EclipseStarter.java:362) at org.eclipse.core.runtime.adaptor.EclipseStarter.run(EclipseStarter.java:175) at org.eclipse.equinox.launcher.Main.invokeFramework(Main.java:515) at org.eclipse.equinox.launcher.Main.basicRun(Main.java:455) at org.eclipse.equinox.launcher.Main.run(Main.java:1193) at org.eclipse.equinox.launcher.Main.main(Main.java:1169) at org.eclipse.core.launcher.Main.main(Main.java:30)
I have released some changes to testJobFamilyCancel so that we still get some useful testing of this API, while hopefully avoiding the intermittent failure. I have re-enabled the test on all platforms.
Timeout waiting for job to start. Job: TestThirdFamily(73), state: 2 junit.framework.AssertionFailedError: Timeout waiting for job to start. Job: TestThirdFamily(73), state: 2 at org.eclipse.core.tests.runtime.jobs.IJobManagerTest.waitForRunCount(IJobManagerTest.java:1680) at org.eclipse.core.tests.runtime.jobs.IJobManagerTest.waitForStart(IJobManagerTest.java:1690) at org.eclipse.core.tests.runtime.jobs.IJobManagerTest.testJobFamilyFind(IJobManagerTest.java:459) at org.eclipse.test.EclipseTestRunner.run(EclipseTestRunner.java:354) at org.eclipse.test.EclipseTestRunner.run(EclipseTestRunner.java:206) at org.eclipse.test.CoreTestApplication.runTests(CoreTestApplication.java:35) at org.eclipse.test.CoreTestApplication.run(CoreTestApplication.java:31) at org.eclipse.equinox.internal.app.EclipseAppContainer.callMethod(EclipseAppContainer.java:547) at org.eclipse.equinox.internal.app.EclipseAppHandle.run(EclipseAppHandle.java:195) at org.eclipse.core.runtime.internal.adaptor.EclipseAppLauncher.runApplication(EclipseAppLauncher.java:106) at org.eclipse.core.runtime.internal.adaptor.EclipseAppLauncher.start(EclipseAppLauncher.java:76) at org.eclipse.core.runtime.adaptor.EclipseStarter.run(EclipseStarter.java:362) at org.eclipse.core.runtime.adaptor.EclipseStarter.run(EclipseStarter.java:175) at org.eclipse.equinox.launcher.Main.invokeFramework(Main.java:549) at org.eclipse.equinox.launcher.Main.basicRun(Main.java:504) at org.eclipse.equinox.launcher.Main.run(Main.java:1236) at org.eclipse.equinox.launcher.Main.main(Main.java:1212) at org.eclipse.core.launcher.Main.main(Main.java:30)
testJobFamilyWakeUp Failure 5.0: expected state: NONE actual state: RUNNING junit.framework.AssertionFailedError: 5.0: expected state: NONE actual state: RUNNING at org.eclipse.core.tests.runtime.jobs.IJobManagerTest.assertState(IJobManagerTest.java:87) at org.eclipse.core.tests.runtime.jobs.IJobManagerTest.testJobFamilyWakeUp(IJobManagerTest.java:1126) at org.eclipse.test.EclipseTestRunner.run(EclipseTestRunner.java:354) at org.eclipse.test.EclipseTestRunner.run(EclipseTestRunner.java:206) at org.eclipse.test.CoreTestApplication.runTests(CoreTestApplication.java:35) at org.eclipse.test.CoreTestApplication.run(CoreTestApplication.java:31) at org.eclipse.equinox.internal.app.EclipseAppContainer.callMethod(EclipseAppContainer.java:547) at org.eclipse.equinox.internal.app.EclipseAppHandle.run(EclipseAppHandle.java:195) at org.eclipse.core.runtime.internal.adaptor.EclipseAppLauncher.runApplication(EclipseAppLauncher.java:107) at org.eclipse.core.runtime.internal.adaptor.EclipseAppLauncher.start(EclipseAppLauncher.java:77) at org.eclipse.core.runtime.adaptor.EclipseStarter.run(EclipseStarter.java:362) at org.eclipse.core.runtime.adaptor.EclipseStarter.run(EclipseStarter.java:175) at org.eclipse.equinox.launcher.Main.invokeFramework(Main.java:549) at org.eclipse.equinox.launcher.Main.basicRun(Main.java:504) at org.eclipse.equinox.launcher.Main.run(Main.java:1236) at org.eclipse.equinox.launcher.Main.main(Main.java:1212) at org.eclipse.core.launcher.Main.main(Main.java:30)
I have disabled the above two failing tests for now. This is likely caused by bug 211799.
Now that bug 211799 is fixed, I will start re-enabling these tests to see if they fail again.
I have re-enabled all tests on all platforms. We'll see what happens...
So far so good. Marking fixed!
From N20080308: 5.0: expected state: NONE actual state: RUNNING junit.framework.AssertionFailedError: 5.0: expected state: NONE actual state: RUNNING at org.eclipse.core.tests.runtime.jobs.IJobManagerTest.assertState(IJobManagerTest.java:87) at org.eclipse.core.tests.runtime.jobs.IJobManagerTest.testJobFamilyWakeUp(IJobManagerTest.java:1122) at org.eclipse.test.EclipseTestRunner.run(EclipseTestRunner.java:354) at org.eclipse.test.EclipseTestRunner.run(EclipseTestRunner.java:206) at org.eclipse.test.CoreTestApplication.runTests(CoreTestApplication.java:35) at org.eclipse.test.CoreTestApplication.run(CoreTestApplication.java:31) at org.eclipse.equinox.internal.app.EclipseAppContainer.callMethod(EclipseAppContainer.java:547) at org.eclipse.equinox.internal.app.EclipseAppHandle.run(EclipseAppHandle.java:195) at org.eclipse.core.runtime.internal.adaptor.EclipseAppLauncher.runApplication(EclipseAppLauncher.java:110) at org.eclipse.core.runtime.internal.adaptor.EclipseAppLauncher.start(EclipseAppLauncher.java:79) at org.eclipse.core.runtime.adaptor.EclipseStarter.run(EclipseStarter.java:362) at org.eclipse.core.runtime.adaptor.EclipseStarter.run(EclipseStarter.java:175) at org.eclipse.equinox.launcher.Main.invokeFramework(Main.java:549) at org.eclipse.equinox.launcher.Main.basicRun(Main.java:504) at org.eclipse.equinox.launcher.Main.run(Main.java:1236) at org.eclipse.equinox.launcher.Main.main(Main.java:1212) at org.eclipse.core.launcher.Main.main(Main.java:30)
(In reply to comment #19) > From N20080308: > > 5.0: expected state: NONE actual state: RUNNING > > junit.framework.AssertionFailedError: 5.0: expected state: NONE actual state: > RUNNING > at > org.eclipse.core.tests.runtime.jobs.IJobManagerTest.assertState(IJobManagerTest.java:87) > at > org.eclipse.core.tests.runtime.jobs.IJobManagerTest.testJobFamilyWakeUp(IJobManagerTest.java:1122) I investigated this and I think that the problem lies in JobManager.cancel() method and JobManager.startJob(). In JobManager.startJob(), after the job is determined to be canceled before it is run, the state of the job is updated outside the synchronization block (in endJob() method). Meanwhile, the test has called cancel(), which returns true, because the state is ABOUT_TO_RUN. The test checks immediately after returning from cancel(), whether the state is NONE, but there is a chance that this check is done after job is determined to be ended, but before the state of the job is updated. I'm not sure what is the proper fix here: 1) change the cancel() method to return false in ABOUT_TO_RUN case 2) change the startJob() method to update the state of the job in synchronized block 3) fix the test to expect this situation (properly not very good idea:)
I can provide patch for this bug, but I need to know what is the preferred way of fixing this.
I'm leaning towards solution 1), which is also likely the safest fix. The ABOUT_TO_RUN state is an internal state that clients don't know about, but essentially this means the job has been removed from the wait queue, and there is a worker thread available that is about to run the job. I think it's fair (and safe) to return false in this case, which is the result the client would get about 1ms later when the thread re-enters the sync block to enter the running state. This is a great find... this failure has been a mystery to me for three years and I worried it represented a deeper synchronization problem that I hadn't tracked down.
Created attachment 115249 [details] The one-liner fix for the problem Here is the one-liner fix for the problem as you suggested. Taking the other road is rather problematic. I have tested the fix and it hasn't failed for 500 passes, whereas before it failed once in 100 passes.
Thanks Danail, I'll get this released.
Fix released and test re-enabled. Thanks Danail!