Some Eclipse Foundation services are deprecated, or will be soon. Please ensure you've read this important communication.

Bug 366423

Summary: Enable configuration of service wait timeout
Product: [RT] Virgo Reporter: malik Mising name <malikconfi>
Component: runtimeAssignee: Glyn Normington <glyn.normington>
Status: CLOSED FIXED QA Contact:
Severity: enhancement    
Priority: P3 CC: b.kapukaranov, glyn.normington, hsiliev
Version: unspecified   
Target Milestone: 3.5.0.M02   
Hardware: PC   
OS: All   
Whiteboard:

Description malik Mising name CLA 2011-12-12 10:25:27 EST
in some slow computer it take more then 30 seconds to load some bundles
(which is the defult timeout for bundle to load in virgo)  
what make virgo to crash/stop working.

and it can be helpful to modify the timeout for it. 

you can see this 2 post:

http://www.eclipse.org/forums/index.php/t/265574/

http://www.eclipse.org/forums/index.php/t/263091/
Comment 1 Glyn Normington CLA 2011-12-13 05:20:45 EST
This timeout is hard-coded in org.eclipse.virgo.kernel.userregion.internal.Activator (MAX_SECONDS_WAIT_FOR_SERVICE) and is actually a service wait timeout rather than a bundle "loading" timeout.
Comment 2 Glyn Normington CLA 2011-12-13 11:24:23 EST
In preparation for fixing this bug, I did a search for all occurrences of Thread.sleep in Virgo and found four uses which may need configuring:

org.eclipse.virgo.kernel.userregion.internal.Activator has a service wait of 30 seconds as noted earlier.

org.eclipse.virgo.kernel.userregionfactory.Activator has a service wait of 30 seconds. It would make sense to configure this using the same configuration value as the userregion service wait.

org.eclipse.virgo.kernel.osgicommand.Activator has a service wait of 20 seconds which could also be configured using the same configuration value as the userregion service wait.

The hot deployer's WatchTask has a fixed scan period of 1 second. No-one has complained about this, so I don't propose to make this configurable until there is a real need to.

(For future reference, here's the search I used including filters for the hits which were not relevant: http://virgo-opengrok.springsource.org/search?q=sleep&defs=&refs=&path=virgo+-medic+-supportability+-repository+-test+-util+-build.xml+-ide+-build-jetty-server+-ivy+-system-verification-tests+-dojo&hist=)
Comment 3 Glyn Normington CLA 2011-12-16 11:07:14 EST
To minimise the number of configuration options, I propose to reuse the org.eclipse.virgo.kernel.startup.wait.limit option described in the User Guide. It is currently documented as:

"Specifies the amount of time, in seconds, after which Virgo times out while trying to start the kernel."

with a default value of 180. This value would be a factor of 6 larger than the current service wait timeout and so would make Virgo run ok out of the box for moderately slow systems. For extremely slow systems, it would seem reasonable to give the kernel longer to start as well as increasing the service wait timeout.

I would propose to change the above wording to:

"Specifies the amount of time, in seconds, after which Virgo times out while trying to start the kernel or when waiting for certain key services during startup."

Thoughts anyone?
Comment 4 Borislav Kapukaranov CLA 2011-12-16 12:16:33 EST
Sounds much better and cleaner than introducing a new property.

Apart from that doesn't reusing the property mean we will always wait for kernel services less than 180 seconds. Not that it matters - it should solve the problem anyway but is an interesting observation.
The 180 seconds wait is triggered at start of the StartupTracker, which comes a few seconds before any bundle starts waiting for kernel services. This also means that potentially the only timeout-related kernel failure will be because of the StartupTracker's timeout and not a service's one. Good that we have event logging for delayed services.

I'm thinking doesn't the old wording actually give a more accurate information to the user. 
After all waiting for key kernel services is an internal detail. The only result users will see when configuring the value of this property is a yes/no answer to the question "Does the server start on my system without timeouts?". Effectively by increasing the value they may achieve success by giving the services more time to register but is that something the user really should be aware of? 
Wdyt?
Comment 5 Glyn Normington CLA 2011-12-19 04:33:00 EST
(In reply to comment #4)
> I'm thinking doesn't the old wording actually give a more accurate information
> to the user. 
> After all waiting for key kernel services is an internal detail. The only
> result users will see when configuring the value of this property is a yes/no
> answer to the question "Does the server start on my system without timeouts?".
> Effectively by increasing the value they may achieve success by giving the
> services more time to register but is that something the user really should be
> aware of? 
> Wdyt?

There are two ways to parse org.eclipse.virgo.kernel.startup.wait.limit.

The first is (org.eclipse.virgo).(kernel.startup.wait.limit). The kernel startup is externally visible through the event log messages (kernel starting/kernel started) before the user region initialisation gets underway, so while I agree that users shouldn't have to worry about internal details, I think the event log messages turn the "pure" kernel startup into an external.

The second is (org.eclipse.virgo.kernel).(startup.wait.limit) which is more logical because it is a kernel property and it can cover any aspects of startup waiting that we consider necessary. Users probably want to gloss over the distinction between kernel startup and user region initialisation, so this is a better fit. There is of course a minor risk that users will expect this value to limit the overall startup wait time, which it won't, so I'll need to make that clear in the docs.

So, prompted by your comment above, I'll take the second approach.
Comment 6 Glyn Normington CLA 2012-01-11 09:48:17 EST
The kernel commits (0d91c0b6348dc3bf8741bb191c334f06452dd5d5, 3afeef8e38b2ddf49d261263c3bf761ed3b984bc) add a property, so we'll need to add the corresponding property into config.ini or some such when we merge those commits into the nano branch from master.

(The new integration test ServiceUtilsTests checks the property value is correctly configured in the test config, so we should get a test failure if we forget.)

The documentation commit 3bfc5ab26e026a82a3ae0d88e21553dbab51b6dd was pushed to the nano branch for ease of merging.
Comment 7 Borislav Kapukaranov CLA 2012-01-12 11:59:41 EST
Added the property change to config.ini in the nano repo with commit 9f8fefd