Created attachment 120657: 6 javacore "dump" files. Recent tests in 3.0 maintenance builds have consistently resulted in DNF for some JSF tests. They apparently pass locally with a 512M heap, but fail on Linux with a 728M heap. In the attached zips, 4 of the dumps happened "spontaneously" from running out of memory ... the other two I snagged while the process appeared 'hung' ... not sure those two reveal much ... I think at that point the test was just waiting with a dialog about "out of memory. we recommend you exit".
The same tests pass in the 3.1 builds. Are there any other differences in the setup between the two build systems? We will continue to debug this issue.
(In reply to comment #1)
> ... Are there any other differences in the setup between the two build systems? ...
No, well ... except one test pre-reqs Ganymede and the other Galileo ... but the build system itself is the same. Perhaps a "fix" went into 3.0.4 but was not carried over to 3.1 yet? It's also odd that it seems to pass sometimes. If there's more I can do to help, let me know.
Here's an oddity in the just-built http://build.eclipse.org/webtools/committers/wtp-R3.0-M/20081217145642/M-3.0.4-20081217145642/ : there is no "DNF" in the JUnit summary for that build, but looking at the logs, there are still lots of "out of memory" errors. Then, looking closer, I see the pagedesigner tests are not listed at all in the summary. I think what happened is that the overall time limit for all JUnit tests was hit, so the whole JUnit cycle came to an abrupt end. This is noticeable too in that the "total number of tests" (8584) is lower in this run than in others (8810). That is, several subsequent suites (e.g. for JSDT) were not run due to this problem.
On another, "local" machine, I get similar DNF failures, but in one case, they all worked, except one: testMultiProject which failed due to "out of memory". Just in case you wanted one test to focus on, that might be a good one?
Would it be possible to get access to the javacore dump files for the JSF test suites that failed? Test dump log: http://build.eclipse.org/webtools/committers/wtp-R3.0-M/20090127153754/M-3.0.4-20090127153754/testResults/consolelogs/testSysErrorLogs/org.eclipse.jst.pagedesigner.tests.AllTests.error.txt
I can get the dumps, but only if I get them within an hour or so of the build/test ... they are erased once the next one starts. I don't save them away anywhere. Was it that particular one? Or do you just want one more recent than the 6 I attached already?
There does seem to me to be a real memory leak. It might be "because of the way the tests are run", but a leak nonetheless. I say this since when I run the tests in my development environment, the memory just keeps going up and up, apparently with each one, but I'm not sure that's literally true.
Because this suite creates 110 projects in quick succession, I wonder if there are some things done on each import (e.g. build and validation) that never get a chance to really finish, because the next one starts up. Thus, each "job" builds memory up more and more since nothing can finish. (build server is fast :) And I think sometimes it hits the limit, and sometimes not, depending on the semi-random method of letting some jobs run to completion, but not others. Once I saw 50 or 70 "jobs" created and displayed in the debugger, all waiting to finish something. Maybe that's normal, but I don't recall ever seeing that many. Normally they are re-used at a faster pace than that, so there are maybe 10 or 15. That 70, btw, was on my "slow" developer machine ... the build server may stack up more.
I can think of a couple of approaches to see if some simple "test hacks" could fix the issue. For example, before you import the next project, pause ... either sleep, or better yet, explicitly wait until some job finishes ... not sure which one, but the build job is the typical one to wait for.
Another approach, even better in the long run, is to change the test so it doesn't have to create 110 projects! That's a lot. And this test takes a long, long time compared to most. Ideally you could import one (or two :) projects, and then run all your tests on that one. I know that may not be possible, but thought it worth mentioning.
If you still want a dump file, ping me and I'll keep an eye open for an opportunity.
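To illustrate the "wait until the build job finishes before importing the next project" idea from this comment, here is a minimal, self-contained sketch in plain Java. It is not the pagedesigner test code: the class name, method names and the 5 ms simulated build are all hypothetical, and a single-threaded executor stands in for the background jobs. In a real Eclipse test the wait would presumably go through the Jobs API instead (something like joining the auto-build job family), which is not shown here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: simulates importing projects that each schedule a
// background "build", and records how many builds were pending at once.
class SerializedImportSketch {

    /** Counts pending background jobs and remembers the highest count seen. */
    static final class BacklogTracker {
        private final AtomicInteger pending = new AtomicInteger();
        private final AtomicInteger peak = new AtomicInteger();

        void started() {
            int now = pending.incrementAndGet();
            peak.accumulateAndGet(now, Math::max);
        }

        void finished() {
            pending.decrementAndGet();
        }

        int peak() {
            return peak.get();
        }
    }

    /** Blocks until the given job is done, without exposing checked exceptions. */
    private static void await(Future<?> job) {
        try {
            job.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        }
    }

    /**
     * Simulates importing projectCount projects. When waitForBuild is true,
     * each import waits for its own build before the next import starts.
     * Returns the peak number of builds that were pending at the same time.
     */
    static int importProjects(int projectCount, boolean waitForBuild) {
        ExecutorService worker = Executors.newSingleThreadExecutor();
        BacklogTracker tracker = new BacklogTracker();
        List<Future<?>> builds = new ArrayList<>();
        try {
            for (int i = 0; i < projectCount; i++) {
                tracker.started(); // the import schedules a build
                Future<?> build = worker.submit(() -> {
                    try {
                        Thread.sleep(5); // stand-in for build and validation work
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    } finally {
                        tracker.finished();
                    }
                });
                builds.add(build);
                if (waitForBuild) {
                    await(build); // the "test hack": let the build finish first
                }
            }
            for (Future<?> build : builds) {
                await(build); // drain whatever is still pending
            }
        } finally {
            worker.shutdown();
        }
        return tracker.peak();
    }

    public static void main(String[] args) {
        System.out.println("peak backlog, waiting:     " + importProjects(20, true));
        System.out.println("peak backlog, not waiting: " + importProjects(20, false));
    }
}
```

With waiting enabled, the backlog never exceeds one pending build. Without it, the backlog can grow toward the number of projects, depending on timing, which is the pile-up of jobs described above.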
This is going on 4 months now. Is this planned to be addressed? Failing tests or tests not completing shouldn't be targeted for Future.
These tests consistently pass on our local machines and we have not been able to determine the cause of the random failures in the build environment. We planned to revisit this if we see the failures again. We will split the tests into smaller units to see if that addresses the issue. I am setting the target to RC1 to do this work.
We will split the tests and add them back to the build in the 3.1.2 timeframe.
These DNFs started to happen (again?) in 3.2 builds. Had they been removed from there? If so, I changed the way tests are run (bug 295153) and might have re-enabled them. In the new scheme, to disable a test suite, you can rename the test.xml file to something like testHOLD.xml. If it has been enabled all along, but more DNFs just started happening, another thing that changed with the new test scheme is that the order of the tests is now different. For example, if you re-use the same workspace from one test to another, it might be larger than before. I doubt you re-use a workspace, but just thought I'd mention it, in case the order of your test suites might matter.
(In reply to comment #10)
> These DNFs started to happen (again?) in 3.2 builds.
> Had they been removed from there? If so, I changed the way tests are run (bug 295153) and might have re-enabled them.
I looked, and see now these had been removed from the master list. Since we no longer have a master list, can you please disable them by renaming your 'test.xml' to 'testHOLD.xml'? You'll need to do so in both HEAD (3.2) and the maintenance branch (3.1.x) if you have one. I haven't changed to the new builder in maintenance builds, but will soon.
I have (temporarily) hardcoded the exclusion of org.eclipse.jst.pagedesigner.tests until JSF team has time to disable it or remove it from build ... or fix! :)
(In reply to comment #12)
> I have (temporarily) hardcoded the exclusion of org.eclipse.jst.pagedesigner.tests until JSF team has time to disable it or remove it from build ... or fix! :)
Please add the test back to the build. We would like to investigate any failure.
(In reply to comment #13)
> (In reply to comment #12)
> Please add the test back to the build. We would like to investigate any failure.
I have stopped excluding this test. Probably won't "show up" for its first run until Saturday, or so. Good luck.
Mass update of Helios bugs