
Bug 298416

Summary: BundleContext.installBundle is not atomic
Product: [Eclipse Project] Equinox
Component: Framework
Reporter: David Kemper <djk>
Assignee: Thomas Watson <tjwatson>
Status: RESOLVED FIXED
QA Contact:
Severity: normal
Priority: P3
CC: bugs.eclipse.org, remy.suen
Version: 3.5.1
Target Milestone: 3.6 M6
Hardware: All
OS: All
Whiteboard:
Attachments: patch (flags: none)

Description David Kemper CLA 2009-12-22 11:10:51 EST
Build Identifier: M20090917-0800

In the OSGi Core Spec r4.2 Section 4.4.3 "Installing Bundles" the specification states that the installation of a bundle in the framework must be:

Persistent – The bundle must remain installed across Framework and Java VM invocations until it is explicitly uninstalled.

Atomic – The install method must completely install the bundle or, if the installation fails, the OSGi Service Platform must be left in the same state as it was in before the method was called.

According to Tom Watson, the Equinox implementation defers writing its state to disk in a background thread that in its default configuration can take up to 30 seconds to flush to disk. (The time between flushes can be configured to narrow the gap, but because it is in another thread, the delay cannot be eliminated.) Thus, though it IS persistent eventually, it is NOT atomic, since the installation is not persisted to disk in the scope of the installBundle call.
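The gap described above can be illustrated with a toy model. This is only a sketch, not Equinox code: the class StateSaveSketch, its methods, and the in-memory list standing in for the state file are all hypothetical. A delay of 0 stands for the synchronous write this report asks for; any other value stands for a flush that a background thread would perform later.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of deferred vs. immediate state persistence (hypothetical, not Equinox code). */
public class StateSaveSketch {
    private final long saveDelayMillis;
    private final List<String> installedInMemory = new ArrayList<>();
    // Stands in for the framework state file on disk.
    private final List<String> persistedOnDisk = new ArrayList<>();

    public StateSaveSketch(long saveDelayMillis) {
        this.saveDelayMillis = saveDelayMillis;
    }

    /** Records the install; persists it before returning only when the delay is 0. */
    public void installBundle(String location) {
        installedInMemory.add(location);
        if (saveDelayMillis == 0) {
            flush();
        }
        // With a non-zero delay, a background saver would call flush() later (omitted here).
    }

    /** Copies the in-memory state to the simulated disk. */
    public void flush() {
        persistedOnDisk.clear();
        persistedOnDisk.addAll(installedInMemory);
    }

    /** What a restart would see if the JVM died right now. */
    public List<String> stateAfterCrash() {
        return new ArrayList<>(persistedOnDisk);
    }

    public static void main(String[] args) {
        StateSaveSketch deferred = new StateSaveSketch(30_000);
        deferred.installBundle("file:example.jar");
        System.out.println("deferred, crash before flush: " + deferred.stateAfterCrash());

        StateSaveSketch immediate = new StateSaveSketch(0);
        immediate.installBundle("file:example.jar");
        System.out.println("immediate, crash after install: " + immediate.stateAfterCrash());
    }
}
```

In the deferred case the install has returned successfully but a crash before the next flush loses it, which is the behavior the reproduction steps below exercise.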

Even adding a shutdown hook to "guarantee" state persistence can fail: for example, if you run out of memory some time after installing a bundle, there may not be enough memory left to perform the writes to disk.

I certainly understand the desire to consolidate state writes in another thread for performance, but from the way I read the specification this can't be spec-compliant. IMHO the *default* implementation should write state out synchronously.

Reproducible: Always

Steps to Reproduce:
1. Start OSGi using Equinox from the command line
2. Connect to the OSGi console
3. Install a bundle using the "install" command (the bundle should be installed and persistent)
4. *Immediately* exit the framework with the "exit" command
5. Restart the OSGi framework, and verify in the console that the installed bundle is no longer there.
Comment 1 Thomas Watson CLA 2010-02-08 17:01:00 EST
Created attachment 158535 [details]
patch

Here is a patch that allows the configuration setting eclipse.stateSaveDelayInterval=0 to indicate that each persistent framework change is saved without delay.

I also added a shutdown hook to persist the framework data when persistent state changes are delayed.
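For readers unfamiliar with the mechanism, a shutdown hook of this kind can be sketched with the standard Runtime.addShutdownHook API. This is a minimal illustration, not the code from the patch: the class ShutdownFlushSketch, the method registerFlushOnShutdown, and the "dirty" flag are hypothetical names.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Minimal sketch of flushing pending state from a JVM shutdown hook (hypothetical names). */
public class ShutdownFlushSketch {

    /** Registers a JVM shutdown hook that runs the given save action; returns the hook thread. */
    public static Thread registerFlushOnShutdown(Runnable saveState) {
        Thread hook = new Thread(saveState, "state-save-on-shutdown");
        Runtime.getRuntime().addShutdownHook(hook);
        return hook;
    }

    public static void main(String[] args) {
        // Pretend a bundle install is waiting for the delayed save.
        AtomicBoolean dirty = new AtomicBoolean(true);
        registerFlushOnShutdown(() -> {
            // Only write if something is still unsaved.
            if (dirty.compareAndSet(true, false)) {
                System.out.println("flushing pending framework state");
            }
        });
        // A normal exit or Ctrl-C runs the hook; a hard kill or a JVM crash does not.
    }
}
```

The hook narrows the window for lost state on orderly shutdown only, so it complements the zero-delay setting rather than replacing it.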
Comment 2 Thomas Watson CLA 2010-02-08 17:01:55 EST
patch released.
Comment 3 Thomas Watson CLA 2010-02-08 17:42:21 EST
There was a bug in the attached patch.

The if (joinWith != null) statement needs braces because of the added debug trace statements:

if (joinWith != null) {
	if (Debug.DEBUG && Debug.DEBUG_GENERAL)
		Debug.println("About to join saving thread"); //$NON-NLS-1$
	// There should be no deadlock when 'shutdown' is true.
	joinWith.join();
	if (Debug.DEBUG && Debug.DEBUG_GENERAL)
		Debug.println("Joined with saving thread"); //$NON-NLS-1$
}

I released the patch with this fixed.  Otherwise NPEs could happen at joinWith.join();
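A self-contained sketch of why the braces matter, assuming the unbraced version looked like the snippet above minus the braces: the null check would then guard only the first trace statement, and the join would run unconditionally. The class GuardedJoinSketch and the method joinIfPresent are hypothetical, and System.out stands in for the Debug trace calls.

```java
/** Sketch of a null-guarded join with trace statements (hypothetical names). */
public class GuardedJoinSketch {

    /** Joins the given thread if there is one; returns whether a join completed. */
    public static boolean joinIfPresent(Thread joinWith) {
        // The braces keep both trace lines AND the join under the null check.
        // Without them, only the first trace statement would be conditional and
        // joinWith.join() would be reached even when joinWith is null.
        if (joinWith != null) {
            System.out.println("About to join saving thread");
            try {
                joinWith.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt status
                return false;
            }
            System.out.println("Joined with saving thread");
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        // No saving thread: nothing to join, and no NullPointerException.
        System.out.println(joinIfPresent(null));

        Thread saver = new Thread(() -> { });
        saver.start();
        System.out.println(joinIfPresent(saver));
    }
}
```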
Comment 4 Thomas Watson CLA 2010-02-10 12:24:12 EST
.
Comment 5 David Kemper CLA 2010-02-10 12:44:20 EST
The shutdown hook should take care of the 95% case, where the user presses Ctrl-C in the process that invoked the framework, or the JVM shuts down.

How is the delay surfaced to the caller? Is

"eclipse.stateSaveDelayInterval=0"

a framework property?

I assume we would then need to make sure all our invocations of Equinox have this property set to guard against JVM crashes.

(I realize that for practical considerations you probably don't back your state with a complete, fault-tolerant DB ;-)
Comment 6 Thomas Watson CLA 2010-02-10 14:03:31 EST
(In reply to comment #5)
> The shutdown hook should take care of the 95% case, where the user presses
> Ctrl-C in the process that invoked the framework, or the JVM shuts down.
> 
> How is the delay surfaced to the caller? Is
> 
> "eclipse.stateSaveDelayInterval=0"
> 
> a framework property?

Yes, you can place that in your config.ini if you are using the Equinox launcher. If you are using the standard OSGi FrameworkFactory, then you have to set this property as a framework configuration property instead.
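For the FrameworkFactory case, a minimal sketch of building the configuration map follows. The class FrameworkConfigSketch and the method syncSaveConfiguration are hypothetical; only the property name and value come from this bug. The launch itself is described in a comment because it needs the OSGi framework JAR on the classpath. With the Equinox launcher, the equivalent is the single line eclipse.stateSaveDelayInterval=0 in config.ini.

```java
import java.util.HashMap;
import java.util.Map;

/** Sketch: framework configuration that requests undelayed state saves (hypothetical names). */
public class FrameworkConfigSketch {
    // Property name taken from the patch described in this bug.
    static final String STATE_SAVE_DELAY = "eclipse.stateSaveDelayInterval";

    /** Builds a configuration map with the save delay set to 0. */
    public static Map<String, String> syncSaveConfiguration() {
        Map<String, String> config = new HashMap<>();
        config.put(STATE_SAVE_DELAY, "0");
        return config;
    }

    public static void main(String[] args) {
        Map<String, String> config = syncSaveConfiguration();
        // With an OSGi framework on the classpath, this map would be passed to
        // org.osgi.framework.launch.FrameworkFactory#newFramework(Map), for example
        // via a factory obtained from java.util.ServiceLoader, before init()/start().
        System.out.println(config);
    }
}
```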

> 
> I assume we would then need to make sure all our invocations of Equinox have
> this property set to guard against JVM crashes.

Yes, but be aware that it has a performance cost: installing a large set of bundles will cause a large number of writes.

> 
> (I realize that for practical considerations you probably don't back your state
> with a complete, fault-tolerant DB ;-)

No, but we do safeguard as much as possible with the org.eclipse.osgi.framework.internal.reliablefile.ReliableFile implementation, which does a lot of extra work to avoid partial writes during power outages and crashes.