From what I can see, the default timeout per test suite is 7200000. I am assuming that is milliseconds, used in Ant's exec task or similar, somewhere. So that would be 7200 seconds, or 120 minutes, or 2 hours per test suite. That seems excessive, especially while we are trying to get the tests running smoothly again. If, for example, 5 tests hang for some reason, that is a 10 hour delay right there. I suggest we crank this _way_ down, say to 15 minutes, or 900000 ms. I know not all test suites can finish in 15 minutes, but it might help to "get through a few passes", and perhaps those that really know they need longer should provide this value themselves? That way the "default" is more reasonable.
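To make the arithmetic concrete, here is a small sketch in Python. It is purely illustrative: the function and constant names are made up for this comment and are not part of the test framework, and it assumes the value really is in milliseconds.

```python
def ms_to_minutes(timeout_ms):
    """Convert a timeout given in milliseconds to minutes."""
    return timeout_ms / 1000 / 60


def worst_case_delay_hours(hung_suites, timeout_ms):
    """Hours lost if `hung_suites` suites each run until the timeout kills them."""
    return hung_suites * ms_to_minutes(timeout_ms) / 60


DEFAULT_TIMEOUT_MS = 7_200_000   # value currently seen on the command line
PROPOSED_TIMEOUT_MS = 900_000    # proposed 15 minute default

print(ms_to_minutes(DEFAULT_TIMEOUT_MS))               # 120.0 minutes
print(worst_case_delay_hours(5, DEFAULT_TIMEOUT_MS))   # 10.0 hours
print(worst_case_delay_hours(5, PROPOSED_TIMEOUT_MS))  # 1.25 hours
```

So the same 5 hung suites would cost 75 minutes instead of 10 hours with the lower default.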
FWIW, I "see" this value in locally running tests, by looking at the running processes and seeing things like

1132 davidw 00:08 94.0 593096 129208 662804 /home/davidw/jdks/ibm-java-x86_64-60-SR9FP2/jre/bin/java -Xms40m -Xmx384m -XX:MaxPermSize=256m -DPLUGIN_PATH= -classpath /home/shared/hudson/hudsonhome/jobs/eclipse-JUnit-Linux2/workspace/ws/2012-04-27_01-59-03/eclipse-testing/test-eclipse/eclipse/plugins/org.eclipse.equinox.launcher_1.3.0.v20120308-1358.jar org.eclipse.core.launcher.Main -application org.eclipse.test.coretestapplication -data /home/shared/hudson/hudsonhome/jobs/eclipse-JUnit-Linux2/workspace/ws/2012-04-27_01-59-03/eclipse-testing/test-eclipse/eclipse/osgi_sniff_folder formatter=org.apache.tools.ant.taskdefs.optional.junit.XMLJUnitResultFormatter,/home/shared/hudson/hudsonhome/jobs/eclipse-JUnit-Linux2/workspace/ws/2012-04-27_01-59-03/eclipse-testing/test-eclipse/eclipse/org.eclipse.osgi.tests.AutomatedTests.xml -testPluginName org.eclipse.osgi.tests -className org.eclipse.osgi.tests.AutomatedTests -os linux -ws gtk -arch x86_64 -consolelog -timeout 7200000

The only place I "see" it in the code is in the org.eclipse.test bundle, as part of the library.xml file. I guess that is compiled as part of the test framework, and each run would use the framework it compiled/created during that build? So I hate to change it there ... that could impact others that use the framework. I guess I could try to set it on the command line that invokes the tests, and if we are lucky it will override the setting in library.xml the usual Ant way.
Well, since I'm experimenting, I'm going to try it _real_ low, at 1 minute (60000 ms) ... that should give a quick(er) idea of whether it's even taking effect. I _think_ the place to set this is in the test.xml file in basebuilder. We are currently using the R4_2_primary branch for both 4.2 I builds and 3.8 I builds. Obviously, if it "works", then I will increase it ... probably first to, say, 15 minutes, then to 30, etc. As a possible example of a "hang", see bug 377863. (If my observations are accurate, there's 2 hours accounted for :)
I found a good place to set 'timeout'. In /org.eclipse.releng.eclipsebuilder/eclipse/buildConfigs/sdk.tests/testScripts are the 3 (or 4) files we use to launch the tests when running on hudson. And there, there's a line similar to

$vmcmd -Dosgi.os=$os -Dosgi.ws=$ws -Dosgi.arch=$arch -jar $launcher -data workspace -application org.eclipse.ant.core.antRunner -file `pwd`/test.xml $tests -Dws=$ws -Dos=$os -Darch=$arch -D$installmode=true $properties -logger org.apache.tools.ant.DefaultLogger

So instead, we can add -Dtimeout=900000 (15 minutes, for now) to those args, and it "takes effect" during the run. (I confirmed this by setting it to 1 minute on my local machine.) From the few results I've peeked at, I think at least 75% of the tests normally complete in under 15 minutes. While we will be missing some perfectly valid test suites that just happen to take more than 15 minutes, we will also avoid waiting on any tests which are hanging, etc. So, I hope, this will allow us to get some "complete runs" that do not take 12 or more hours, and we can at least have some results to look at, and fine tune from there. Advice welcome.
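For anyone scripting this rather than editing the launch scripts by hand, here is a rough sketch in Python of adding the property to an argument list. The helper name `with_timeout` and the sample argument values are hypothetical, and appending the property at the end of the list is just a choice made for this sketch.

```python
def with_timeout(ant_args, timeout_ms):
    """Return a copy of the antRunner arguments with -Dtimeout set.

    Any existing -Dtimeout=... entry is dropped first, so the property
    is not defined twice on the same command line.
    """
    kept = [arg for arg in ant_args if not arg.startswith("-Dtimeout=")]
    return kept + [f"-Dtimeout={timeout_ms}"]


# Sample arguments, loosely modelled on the launch line above.
args = [
    "-application", "org.eclipse.ant.core.antRunner",
    "-file", "test.xml",
    "-Dws=gtk", "-Dos=linux", "-Darch=x86_64",
    "-logger", "org.apache.tools.ant.DefaultLogger",
]

print(" ".join(with_timeout(args, 900000)))
```

Calling it again with a different value (say 60000 for a quick local check) replaces the earlier setting instead of stacking a second one.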
To give another "data point" ... there's a mac test still running on eclipse.org after 16 hours ... that must be due to a number of "2 hour hangs", I'd think. (I'm going to cancel it and start a pass with a 15 minute per-test limit.)
There are test suites that take 2 hours - for instance jdt.core. One approach I also used was to comment out calls in the test.xml (all target) to long running suites and just run the ones with problems on Hudson. Thought I'd mention this in case this was helpful.
(In reply to comment #5)
> There are test suites that take 2 hours - for instance jdt.core. One approach
> I also used was to comment out calls in the test.xml (all target) to long
> running suites and just run the ones with problems on Hudson. Thought I'd
> mention this in case this was helpful.

It is helpful. Thanks. One thing we did in WTP (having our own custom library.xml file under our control) was to define a "default timeout" which we could set low (I still think it was 30 minutes), and those tests that _knew_ they needed a long time could override the default in their test.xml file. That way we avoid having a number of (usually UI) tests that normally take 5 minutes "suddenly" cause the whole process to fail because they each started to take 2 hours to time out. But ... that will be a longer term enhancement. We will still focus on "getting some results" first.
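The "low default, explicit override" idea can be sketched roughly like this in Python. This is not how library.xml or the WTP setup is actually implemented; the names, the override table, and the "jdt.core" key are placeholders chosen for illustration.

```python
DEFAULT_TIMEOUT_MS = 30 * 60 * 1000  # low default: 30 minutes

# Hypothetical table of suites that declare they need longer.
SUITE_TIMEOUT_OVERRIDES_MS = {
    "jdt.core": 2 * 60 * 60 * 1000,  # known to take about 2 hours
}


def timeout_for(suite, overrides=SUITE_TIMEOUT_OVERRIDES_MS,
                default_ms=DEFAULT_TIMEOUT_MS):
    """Use the suite's own timeout if it declared one, else the low default."""
    return overrides.get(suite, default_ms)


print(timeout_for("jdt.core"))                # 7200000
print(timeout_for("org.eclipse.osgi.tests"))  # 1800000
```

With this shape, a suite that starts hanging costs at most the default, and only suites that opted in can hold up a run for 2 hours.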
This bug hasn't had any activity in quite some time. Maybe the problem got resolved, was a duplicate of something else, or became less pressing for some reason - or maybe it's still relevant but just hasn't been looked at yet. If you have further information on the current state of the bug, please add it. The information can be, for example, that the problem still occurs, that you still want the feature, that more information is needed, or that the bug is (for whatever reason) no longer relevant. If the bug is still relevant, please remove the "stalebug" whiteboard tag.