| Summary: | [jobs] Test failures in refactoring tests: "Error changing from state: 16" | ||
|---|---|---|---|
| Product: | [Eclipse Project] Platform | Reporter: | Markus Keller <markus.kell.r> |
| Component: | Runtime | Assignee: | John Arthorne <john.arthorne> |
| Status: | RESOLVED FIXED | QA Contact: | |
| Severity: | major | ||
| Priority: | P2 | CC: | andre_weinand, benno.baumgartner, daniel_megert, john.arthorne |
| Version: | 3.2 | ||
| Target Milestone: | 3.2 M6 | ||
| Hardware: | All | ||
| OS: | All | ||
| Whiteboard: | |||
|
Description
Markus Keller
This happens a lot on my machine, and it's an XP SP2 running on Intel P4...

JRE: j9n142-20050609
no VM args

Interesting ... there haven't been more than cosmetic changes in the jobs implementation in 3.2, so something new in the tests or the code being tested must be inducing this. I won't be able to investigate until Monday.

Benno, can you give me hints on how you easily reproduce this? Is there a particular test that you run that reproduces it? I have run AllRefactoringTests a couple of times without getting the failure, but it takes so long to run that it's very hard to track it down. I think this is a bug that I've been tracking for months but have never been able to reproduce myself. I added the extra error checking to help track it down, but if I could reproduce it locally I'm sure I could make better progress.

With:
Version: 3.2.0
Build id: I20060301-0800

And JRE (other VMs seem to work fine):
C:\java\j9n142-20050609\bin>java -version
java version "1.4.2"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.4.2)
Classic VM (build 1.4.2, J2RE 1.4.2 IBM Windows 32 build cn142-20050609 (JIT enabled: jitc))

And running: org.eclipse.jdt.ui.tests.core.CallHierarchyTest

I get at least one of the above exceptions in about 80% of the runs. Hope that helps...

Thanks Benno. I found that VM, but still no luck reproducing. Two more questions:
- Are you using the -Xj9 VM argument? If not, does it still fail with that arg?
- Is your machine a dual processor?

> Are you using -Xj9 VM argument? If not, does it still fail with that arg?
I don't use the argument. But it also fails with the arg.

> Is your machine a dual processor?
Yes, it is a dual processor machine.
I have created a JUnit test that reproduces the problem: org.eclipse.core.tests.runtime.jobs.Bug_129551

Here is what happens in this test:
1) Job1 and Job2 are created and given the same scheduling rule.
2) Job1 is scheduled, followed by Job2.
3) Job1 is pulled from the wait queue by a worker thread and moved to the ABOUT_TO_RUN state.
4) Another worker thread dequeues Job2, notices that it conflicts with Job1, and adds it to Job1's queue of blocked jobs.
5) Job1 is put to sleep before it starts to run. The sleep queue uses the same next and previous fields that are used to maintain the list of blocked jobs, so this causes an assertion error.

It is a very small timing hole between the time the job enters the ABOUT_TO_RUN state and the time it actually runs. If the job is put to sleep in this small period of time, and another job is blocked on it, it will cause the failure. It seems that on a multi-processor machine with a particular VM, this timing hole is big enough for the failure to happen. I have sent Benno a patch to verify my fix.

Looks like you fixed it! I can't reproduce it any longer with the patch you sent me. Even after 10 runs I did not get any exception.

Excellent! Thank you very much Benno for your help in tracking this down and verifying the fix. I have been trying to figure this one out for several months (the bug likely existed since 3.0). The fix and automated test have been released. |
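For readers following steps 1 to 5 above, here is a minimal single-threaded model of the failure mode, written in plain Java. It is only a sketch and not the Eclipse jobs implementation: `Node`, `SleepQueue`, `addBlocked`, and the `IllegalStateException` guard are hypothetical names invented for illustration. The one thing taken from the thread is the idea that the blocked-job chain and the sleep queue are threaded through the same `next` and `previous` fields, so a job that still has another job blocked behind it cannot be linked into the sleep queue cleanly.

```java
public class SharedLinkDemo {
    /** Hypothetical stand-in for a job that stores its list links in itself. */
    static final class Node {
        final String name;
        Node next;      // shared by the blocked chain AND the sleep queue
        Node previous;  // shared by the blocked chain AND the sleep queue
        Node(String name) { this.name = name; }
    }

    /** Hypothetical sleep queue: a doubly linked list threaded through next/previous. */
    static final class SleepQueue {
        private Node head;

        void enqueue(Node node) {
            // An intrusive list can only accept a node that is in no other list.
            if (node.next != null || node.previous != null) {
                throw new IllegalStateException(node.name + " is still linked into another list");
            }
            node.next = head;
            if (head != null) {
                head.previous = node;
            }
            head = node;
        }
    }

    /** Appends 'blocked' to the chain of jobs waiting behind 'blocker'. */
    static void addBlocked(Node blocker, Node blocked) {
        Node tail = blocker;
        while (tail.next != null) {
            tail = tail.next;
        }
        tail.next = blocked;
        blocked.previous = tail;
    }

    /** Replays steps 4 and 5 in one thread; returns true if the sleep was rejected. */
    static boolean sleepFailsWhileBlocking() {
        Node job1 = new Node("Job1");
        Node job2 = new Node("Job2");
        SleepQueue sleeping = new SleepQueue();
        addBlocked(job1, job2);       // step 4: Job2 is queued behind Job1
        try {
            sleeping.enqueue(job1);   // step 5: Job1 is put to sleep before it runs
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    /** Control case: with nothing blocked behind it, the job can be put to sleep. */
    static boolean sleepSucceedsWhenIdle() {
        SleepQueue sleeping = new SleepQueue();
        try {
            sleeping.enqueue(new Node("Job1"));
            return true;
        } catch (IllegalStateException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("sleep rejected while blocking: " + sleepFailsWhileBlocking());
        System.out.println("sleep accepted when idle: " + sleepSucceedsWhenIdle());
    }
}
```

In the real bug the two operations happen on different worker threads, which is why the failure depends on timing; the model fixes the order only to make the conflict between the two lists visible.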