Bug 377859 - Default timeout of 7200000 ms is too high?
Summary: Default timeout of 7200000 ms is too high?
Status: CLOSED WONTFIX
Alias: None
Product: Platform
Classification: Eclipse Project
Component: Releng
Version: 4.2
Hardware: PC Linux
Importance: P3 normal
Target Milestone: ---
Assignee: Platform-Releng-Inbox CLA
QA Contact:
URL:
Whiteboard: stalebug
Keywords:
Depends on:
Blocks: 377365
Reported: 2012-04-27 02:50 EDT by David Williams CLA
Modified: 2019-11-14 03:22 EST (History)

See Also:


Description David Williams CLA 2012-04-27 02:50:21 EDT
From what I can see, the default timeout per test suite is 7200000. I am assuming that's milliseconds, used in Ant's exec task or similar, somewhere.

So, that'd be 7200 seconds, or 120 minutes, or 2 hours per test suite. That seems excessive, especially while trying to get the tests running smoothly again. If, for example, 5 tests hang for some reason, that's a 10-hour delay right there.

I suggest we crank this _way_ down, say to 15 minutes, or 900000 ms. 
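
To make the arithmetic above easy to re-check for other values, here is a minimal sketch in POSIX shell. The function names `ms_to_minutes` and `hang_delay_minutes` are hypothetical helpers made up for this illustration; they are not part of the test framework.

```shell
#!/bin/sh
# Convert a timeout in milliseconds to whole minutes.
ms_to_minutes() {
  echo $(( $1 / 1000 / 60 ))
}

# Worst-case delay in minutes if several suites each hang until the timeout.
# Arguments: timeout in ms, number of hanging suites.
hang_delay_minutes() {
  echo $(( $(ms_to_minutes "$1") * $2 ))
}

echo "current default:  $(ms_to_minutes 7200000) min per suite"
echo "proposed default: $(ms_to_minutes 900000) min per suite"
echo "5 hangs, current default:  $(hang_delay_minutes 7200000 5) min"
echo "5 hangs, proposed default: $(hang_delay_minutes 900000 5) min"
```

With the current default, 5 hanging suites cost 600 minutes (the 10 hours mentioned above); at 900000 ms the same 5 hangs would cost 75 minutes.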

I know not all test suites can finish in 15 minutes, but it might help to "get through a few passes", and perhaps those that really know they need longer should provide this value themselves? That way the "default" is more reasonable?
Comment 1 David Williams CLA 2012-04-27 03:02:16 EDT
FWIW, I "see" this value in locally running tests, by looking at the running processes, and seeing things like 

1132 davidw         00:08 94.0 593096 129208 662804 /home/davidw/jdks/ibm-java-x86_64-60-SR9FP2/jre/bin/java -Xms40m -Xmx384m -XX:MaxPermSize=256m -DPLUGIN_PATH= -classpath /home/shared/hudson/hudsonhome/jobs/eclipse-JUnit-Linux2/workspace/ws/2012-04-27_01-59-03/eclipse-testing/test-eclipse/eclipse/plugins/org.eclipse.equinox.launcher_1.3.0.v20120308-1358.jar org.eclipse.core.launcher.Main -application org.eclipse.test.coretestapplication -data /home/shared/hudson/hudsonhome/jobs/eclipse-JUnit-Linux2/workspace/ws/2012-04-27_01-59-03/eclipse-testing/test-eclipse/eclipse/osgi_sniff_folder formatter=org.apache.tools.ant.taskdefs.optional.junit.XMLJUnitResultFormatter,/home/shared/hudson/hudsonhome/jobs/eclipse-JUnit-Linux2/workspace/ws/2012-04-27_01-59-03/eclipse-testing/test-eclipse/eclipse/org.eclipse.osgi.tests.AutomatedTests.xml -testPluginName org.eclipse.osgi.tests -className org.eclipse.osgi.tests.AutomatedTests -os linux -ws gtk -arch x86_64 -consolelog -timeout 7200000
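
For anyone wanting to check the same thing on their own machine, a rough sketch of pulling the `-timeout` value out of such a command line follows. `extract_timeout` is a hypothetical helper, and the sample command line is shortened for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print the number that follows "-timeout" in a command line.
# Prints nothing if the command line carries no "-timeout <digits>" argument.
extract_timeout() {
  printf '%s\n' "$1" | sed -n 's/.*-timeout \([0-9][0-9]*\).*/\1/p'
}

# Shortened stand-in for the process line shown above.
extract_timeout "java -classpath launcher.jar org.eclipse.core.launcher.Main -consolelog -timeout 7200000"
```

On Linux, the input could presumably come from something like `ps -A -o args`, one line per process; that part is an assumption about the local setup and is not shown here.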


The only place I "see" it in the code is in the org.eclipse.test bundle, as part of the library.xml file. I guess that's compiled as part of the test framework, and each run would use the framework it compiled/created during that build?

So, I hate to change it there ... that could impact others that use the framework.

I guess I could try setting it on the command line that invokes the tests; if we are lucky, it will override the setting in library.xml, the way Ant command-line properties normally do.
Comment 2 David Williams CLA 2012-04-27 03:23:11 EDT
Well, since I'm experimenting, I'm going to try it _real_ low at 1 minute,
60000 ms ... that should give a quick(er) idea if it's even taking effect.

I _think_ the place to set this is in the test.xml file in basebuilder. We are
currently using the R4_2_primary branch for both 4.2 I builds and 3.8 I builds. 

Obviously, if it "works", then I will increase it ... probably first to, say, 15
minutes, then to 30, etc.

As a possible example of a "hang", see bug 377863. (if my observations are
accurate, there's 2 hours accounted for :)
Comment 3 David Williams CLA 2012-04-27 05:04:21 EDT
I found a good place to set 'timeout'. In 

/org.eclipse.releng.eclipsebuilder/eclipse/buildConfigs/sdk.tests/testScripts

are the 3 (or 4) files we use to launch the tests when running on Hudson. And there, there's a line similar to

$vmcmd  -Dosgi.os=$os -Dosgi.ws=$ws -Dosgi.arch=$arch -jar $launcher -data workspace -application org.eclipse.ant.core.antRunner -file `pwd`/test.xml $tests -Dws=$ws -Dos=$os -Darch=$arch -D$installmode=true $properties -logger org.apache.tools.ant.DefaultLogger

So instead, we can add -Dtimeout=900000 (15 minutes, for now) to those args, and it "takes effect" during the run. (I confirmed by setting it to 1 minute on my local machine.)

From the few results I've peeked at, I think at least 75% of the tests normally complete in under 15 minutes. We will be missing some perfectly valid test suites that just happen to take more than 15 minutes, but we will also avoid waiting on any tests that are hanging. So, I hope, this will allow us to get some "complete runs" that take less than 12 or more hours, and we can at least have some results to look at and fine-tune from there.
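
As a rough sketch of how that argument could be built, the snippet below computes the property from a limit in minutes and appends it to `$properties`, the variable already present on the launch line above. `timeout_property` is a hypothetical helper, and appending to `$properties` is only one assumed way of getting the value onto the command line; the real scripts may do it differently.

```shell
#!/bin/sh
# Hypothetical helper: turn a per-suite limit in minutes into the Ant
# property used above (the value of "timeout" is in milliseconds).
timeout_property() {
  echo "-Dtimeout=$(( $1 * 60 * 1000 ))"
}

# Assumption: $properties is later expanded on the antRunner command line,
# as in the launch line shown above. Nothing is launched here; the value
# is only printed.
properties="${properties:-} $(timeout_property 15)"
echo "extra Ant arguments:$properties"
```

Changing the limit for a later pass (30 minutes, say) then means changing one number rather than recomputing milliseconds by hand.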

Advice welcome.
Comment 4 David Williams CLA 2012-04-27 05:10:21 EDT
To give another "data point" ... there's a Mac test still running on eclipse.org after 16 hours ... that must be due to a number of "2 hour hangs", I'd think.

(I'm going to cancel it, and start a pass with a 15-minute per-test limit.)
Comment 5 Kim Moir CLA 2012-04-27 08:02:05 EDT
There are test suites that take 2 hours - for instance jdt.core.  One approach I also used was to comment out calls in the test.xml (all target) to long running suites and just run the ones with problems on Hudson.  Thought I'd mention this in case this was helpful.
Comment 6 David Williams CLA 2012-04-27 12:36:15 EDT
(In reply to comment #5)
> There are test suites that take 2 hours - for instance jdt.core.  One approach
> I also used was to comment out calls in the test.xml (all target) to long
> running suites and just run the ones with problems on Hudson.  Thought I'd
> mention this in case this was helpful.

It is helpful. Thanks. 

One thing we did in WTP (having our own custom library.xml file under our control) was to define a "default timeout" which we could set low (I still think it was 30 minutes), and those tests that _knew_ they needed a long time could override the default in their test.xml file. That keeps a number of (usually UI) tests that normally take 5 minutes from "suddenly" causing the whole process to fail because they each started to take 2 hours to time out. But ... that will be a longer-term enhancement. We will still focus on "getting some results" first.
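
The "low default, explicit override" rule described above can be sketched as follows. This is not the WTP library.xml, only an illustration of the selection logic; `DEFAULT_TIMEOUT_MS` and `effective_timeout` are hypothetical names, and the 30-minute value is the one recalled above.

```shell
#!/bin/sh
# Assumed shared default for illustration: 30 minutes, in milliseconds.
DEFAULT_TIMEOUT_MS=1800000

# Print the timeout to use for a suite: the suite's own value if it
# declares one, otherwise the shared default.
effective_timeout() {
  suite_override=$1
  echo "${suite_override:-$DEFAULT_TIMEOUT_MS}"
}

effective_timeout ""        # a suite with no override gets the default
effective_timeout 7200000   # a long-running suite asks for 2 hours
```

The point of the design is that a hang in an ordinary suite costs at most the short default, while only suites that opted in can hold a run for 2 hours.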
Comment 7 Lars Vogel CLA 2019-11-14 03:22:07 EST
This bug hasn't had any activity in quite some time. Maybe the problem got resolved, was a duplicate of something else, or became less pressing for some reason - or maybe it's still relevant but just hasn't been looked at yet.

If you have further information on the current state of the bug, please add it. The information can be, for example, that the problem still occurs, that you still want the feature, that more information is needed, or that the bug is (for whatever reason) no longer relevant.

If the bug is still relevant, please remove the "stalebug" whiteboard tag.