Some Eclipse Foundation services are deprecated, or will be soon. Please ensure you've read this important communication.

Bug 342719

Summary: Intermittent resolver errors in kernel integration tests
Product: [RT] Virgo Reporter: Glyn Normington <glyn.normington>
Component: unknownAssignee: Glyn Normington <glyn.normington>
Status: CLOSED WORKSFORME QA Contact:
Severity: normal    
Priority: P3 CC: tjwatson
Version: 3.0.0.M03   
Target Milestone: ---   
Hardware: Macintosh   
OS: Mac OS X - Carbon (unsup.)   
Whiteboard:

Description Glyn Normington CLA 2011-04-13 10:26:33 EDT
I've noticed this during the past few days although on one occasion it was when building SpringSource dm Server 2.0.5 so it is more likely to be environmental possibly combined with a long-standing bug rather than a new code bug. I am recording it so I, and others if they encounter the problem, can build up a pattern.

Typically a deployer integration test fails with a resolution error which does not reproduce when the tests are re-run. Some basic bundle in the user region fails to deploy, so unfortunately no state dump is taken.

For example:

  [exec]     [junit] ------------- Standard Output ---------------
     [exec]     [junit] [2011-04-13 15:06:53.474] startup-tracker          <KE0001I> Kernel starting. 
     [exec]     [junit] ------------- ---------------- ---------------
     [exec]     [junit] Testcase: org.eclipse.virgo.kernel.deployer.test.ServiceScopingTests:	Caused an ERROR
     [exec]     [junit] Bundle org.eclipse.virgo.kernel.userregionfactory 3.0.0.BUILD-20110413135357 failed to start.
     [exec]     [junit] org.osgi.framework.BundleException: Bundle org.eclipse.virgo.kernel.userregionfactory 3.0.0.BUILD-20110413135357 failed to start.
     [exec]     [junit] 	at org.eclipse.virgo.osgi.launcher.FrameworkBuilder.installAndStartBundles(FrameworkBuilder.java:189)
     [exec]     [junit] 	at org.eclipse.virgo.osgi.launcher.FrameworkBuilder.start(FrameworkBuilder.java:159)
     [exec]     [junit] 	at org.eclipse.virgo.test.framework.OsgiTestRunner.launchOsgi(OsgiTestRunner.java:145)
     [exec]     [junit] 	at org.eclipse.virgo.test.framework.OsgiTestRunner.run(OsgiTestRunner.java:79)
     [exec]     [junit] Caused by: org.osgi.framework.BundleException: Exception in org.eclipse.virgo.kernel.userregionfactory.Activator.start() of bundle org.eclipse.virgo.kernel.userregionfactory.
     [exec]     [junit] 	at org.eclipse.osgi.framework.internal.core.BundleContextImpl.startActivator(BundleContextImpl.java:734)
     [exec]     [junit] 	at org.eclipse.osgi.framework.internal.core.BundleContextImpl.start(BundleContextImpl.java:683)
     [exec]     [junit] 	at org.eclipse.osgi.framework.internal.core.BundleHost.startWorker(BundleHost.java:381)
     [exec]     [junit] 	at org.eclipse.osgi.framework.internal.core.AbstractBundle.start(AbstractBundle.java:299)
     [exec]     [junit] 	at org.eclipse.osgi.framework.internal.core.AbstractBundle.start(AbstractBundle.java:291)
     [exec]     [junit] 	at org.eclipse.virgo.osgi.launcher.FrameworkBuilder.installAndStartBundles(FrameworkBuilder.java:187)
     [exec]     [junit] Caused by: org.osgi.framework.BundleException: Failed to start bundle org.springframework.osgi.extender 1.2.1
     [exec]     [junit] 	at org.eclipse.virgo.kernel.userregionfactory.Activator.initialiseUserRegionBundles(Activator.java:341)
     [exec]     [junit] 	at org.eclipse.virgo.kernel.userregionfactory.Activator.createUserRegion(Activator.java:172)
     [exec]     [junit] 	at org.eclipse.virgo.kernel.userregionfactory.Activator.start(Activator.java:127)
     [exec]     [junit] 	at org.eclipse.osgi.framework.internal.core.BundleContextImpl$1.run(BundleContextImpl.java:711)
     [exec]     [junit] 	at java.security.AccessController.doPrivileged(Native Method)
     [exec]     [junit] 	at org.eclipse.osgi.framework.internal.core.BundleContextImpl.startActivator(BundleContextImpl.java:702)
     [exec]     [junit] Caused by: org.osgi.framework.BundleException: The bundle "org.springframework.osgi.extender_1.2.1 [46]" could not be resolved. Reason: Missing Constraint: Import-Package: org.springframework.osgi; version="[1.2.1,1.2.1]"
     [exec]     [junit] 	at org.eclipse.osgi.framework.internal.core.AbstractBundle.getResolverError(AbstractBundle.java:1327)
     [exec]     [junit] 	at org.eclipse.osgi.framework.internal.core.AbstractBundle.getResolutionFailureException(AbstractBundle.java:1311)
     [exec]     [junit] 	at org.eclipse.osgi.framework.internal.core.BundleHost.startWorker(BundleHost.java:323)
     [exec]     [junit] 	at org.eclipse.osgi.framework.internal.core.AbstractBundle.start(AbstractBundle.java:299)
     [exec]     [junit] 	at org.eclipse.osgi.framework.internal.core.AbstractBundle.start(AbstractBundle.java:291)
     [exec]     [junit] 	at org.eclipse.virgo.kernel.userregionfactory.Activator.initialiseUserRegionBundles(Activator.java:339)
     [exec]     [junit] 
     [exec]     [junit] 
     [exec]     [junit] [2011-04-13 15:06:55.768] System Bundle Shutdown   <KE0010I> Shutdown initiated. 
     [exec]     [junit] [2011-04-13 15:06:55.774] startup-tracker          <KE0006E> Kernel failed to start due to the aborted start of 'org.eclipse.virgo.kernel.deployer' version '3.0.0.BUILD-20110413135357'.

I am running on Mac OS X 10.6.7 with 12 GB of RAM and a dual-core Intel Xeon 2.66 GHz processor and with:

$ java -version
java version "1.6.0_24"
Java(TM) SE Runtime Environment (build 1.6.0_24-b07-334-10M3326)
Java HotSpot(TM) 64-Bit Server VM (build 19.1-b02-334, mixed mode)

and with:

$ ant -version
Apache Ant version 1.7.1 compiled on June 27 2008
Comment 1 Glyn Normington CLA 2011-05-04 02:52:37 EDT
Borislav appeared to encounter this when building 3.0.0.M04.
Comment 2 Glyn Normington CLA 2011-05-04 02:58:58 EDT
Hit the problem twice in succession. Here are the diagnostics for the second failure.

[exec]     [junit] Testsuite: org.eclipse.virgo.kernel.deployer.test.TransitiveConstraintFailureDiagnosisTests
     [exec]     [junit] Tests run: 0, Failures: 0, Errors: 1, Time elapsed: 4.65 sec
     [exec]     [junit] 
     [exec]     [junit] ------------- Standard Output ---------------
     [exec]     [junit] [2011-05-03 17:01:03.459] startup-tracker          <KE0001I> Kernel starting. 
     [exec]     [junit] ------------- ---------------- ---------------
     [exec]     [junit] Testcase: org.eclipse.virgo.kernel.deployer.test.TransitiveConstraintFailureDiagnosisTests:	Caused an ERROR
     [exec]     [junit] Bundle org.eclipse.virgo.kernel.userregionfactory 3.0.0.BUILD-20110503154904 failed to start.
     [exec]     [junit] org.osgi.framework.BundleException: Bundle org.eclipse.virgo.kernel.userregionfactory 3.0.0.BUILD-20110503154904 failed to start.
     [exec]     [junit] 	at org.eclipse.virgo.osgi.launcher.FrameworkBuilder.installAndStartBundles(FrameworkBuilder.java:189)
     [exec]     [junit] 	at org.eclipse.virgo.osgi.launcher.FrameworkBuilder.start(FrameworkBuilder.java:159)
     [exec]     [junit] 	at org.eclipse.virgo.test.framework.OsgiTestRunner.launchOsgi(OsgiTestRunner.java:145)
     [exec]     [junit] 	at org.eclipse.virgo.test.framework.OsgiTestRunner.run(OsgiTestRunner.java:79)
     [exec]     [junit] Caused by: org.osgi.framework.BundleException: Exception in org.eclipse.virgo.kernel.userregionfactory.Activator.start() of bundle org.eclipse.virgo.kernel.userregionfactory.
     [exec]     [junit] 	at org.eclipse.osgi.framework.internal.core.BundleContextImpl.startActivator(BundleContextImpl.java:734)
     [exec]     [junit] 	at org.eclipse.osgi.framework.internal.core.BundleContextImpl.start(BundleContextImpl.java:683)
     [exec]     [junit] 	at org.eclipse.osgi.framework.internal.core.BundleHost.startWorker(BundleHost.java:381)
     [exec]     [junit] 	at org.eclipse.osgi.framework.internal.core.AbstractBundle.start(AbstractBundle.java:299)
     [exec]     [junit] 	at org.eclipse.osgi.framework.internal.core.AbstractBundle.start(AbstractBundle.java:291)
     [exec]     [junit] 	at org.eclipse.virgo.osgi.launcher.FrameworkBuilder.installAndStartBundles(FrameworkBuilder.java:187)
     [exec]     [junit] Caused by: org.osgi.framework.BundleException: Failed to start bundle org.springframework.osgi.extender 1.2.1
     [exec]     [junit] 	at org.eclipse.virgo.kernel.userregionfactory.Activator.initialiseUserRegionBundles(Activator.java:341)
     [exec]     [junit] 	at org.eclipse.virgo.kernel.userregionfactory.Activator.createUserRegion(Activator.java:172)
     [exec]     [junit] 	at org.eclipse.virgo.kernel.userregionfactory.Activator.start(Activator.java:127)
     [exec]     [junit] 	at org.eclipse.osgi.framework.internal.core.BundleContextImpl$1.run(BundleContextImpl.java:711)
     [exec]     [junit] 	at java.security.AccessController.doPrivileged(Native Method)
     [exec]     [junit] 	at org.eclipse.osgi.framework.internal.core.BundleContextImpl.startActivator(BundleContextImpl.java:702)
     [exec]     [junit] Caused by: org.osgi.framework.BundleException: The bundle "org.springframework.osgi.extender_1.2.1 [46]" could not be resolved. Reason: Missing Constraint: Import-Package: org.springframework.osgi; version="[1.2.1,1.2.1]"
     [exec]     [junit] 	at org.eclipse.osgi.framework.internal.core.AbstractBundle.getResolverError(AbstractBundle.java:1327)
     [exec]     [junit] 	at org.eclipse.osgi.framework.internal.core.AbstractBundle.getResolutionFailureException(AbstractBundle.java:1311)
     [exec]     [junit] 	at org.eclipse.osgi.framework.internal.core.BundleHost.startWorker(BundleHost.java:323)
     [exec]     [junit] 	at org.eclipse.osgi.framework.internal.core.AbstractBundle.start(AbstractBundle.java:299)
     [exec]     [junit] 	at org.eclipse.osgi.framework.internal.core.AbstractBundle.start(AbstractBundle.java:291)
     [exec]     [junit] 	at org.eclipse.virgo.kernel.userregionfactory.Activator.initialiseUserRegionBundles(Activator.java:339)
Comment 3 Glyn Normington CLA 2011-05-04 03:13:15 EDT
Added logic to take a dump if a user region bundle fails to start in kernel commit 6d0d41e010e1e1151797e7e29204b20ea5f58c4e:

            try {
                     bundle.start();
                 } catch (BundleException e) {
+                    // Take state dump for diagnosis of resolution failures
+                    this.dumpGenerator.generateDump("User region bundle failed to start", e);
                     throw new BundleException("Failed to start bundle " + bundle.getSymbolicName() + " " + bundle.getVersion(), e);
                 }
Comment 4 Glyn Normington CLA 2011-05-04 10:41:31 EDT
Added a wait to give the resolution state dump contributor a chance to be registered in kernel commit 1645c9db2c853cdfe0a06e99eaffe284b55d5439:

+                    // Give the resolution state dump contributor a chance to be registered.
+                    try {
+                        Thread.sleep(10000);
+                    } catch (InterruptedException _) {
+                    }
                     // Take state dump for diagnosis of resolution failures
                     this.dumpGenerator.generateDump("User region bundle failed to start", e);
                     throw new BundleException("Failed to start bundle " + bundle.getSymbolicName() + " " + bundle.getVersion(), e);
Comment 5 Glyn Normington CLA 2011-05-04 12:00:52 EDT
On the latest ripple, the bug did not reproduce. Good news for the ripple, bad news for this bug.
Comment 6 Glyn Normington CLA 2011-05-05 02:20:39 EDT
Reproduced the failure by building the kernel a couple of times. Unfortunately, the resolution state dump contributor does not seem to have been registered in time, so the resultant dump does not include the OSGi state. Perhaps the user region bundle which registers the dump contributor is waiting for one or more services from bundles in the kernel which are waiting for the user region to be available, etc, as can be seen from the console messages below:

   [exec]     [junit] Testsuite: org.eclipse.virgo.kernel.deployer.test.OnStartedFailureTests
     [exec]     [junit] Tests run: 0, Failures: 0, Errors: 1, Time elapsed: 14.358 sec
     [exec]     [junit] 
     [exec]     [junit] ------------- Standard Output ---------------
     [exec]     [junit] [2011-05-05 07:03:27.398] startup-tracker          <KE0001I> Kernel starting. 
     [exec]     [junit] [2011-05-05 07:03:33.133] service-monitor-thread-1 <KE0100W> Reference '&packageAdminUtil' in bundle 'org.eclipse.virgo.kernel.deployer' version '3.0.0.BUILD-20110505055327' is waiting for service with filter '(objectClass=org.eclipse.virgo.kernel.osgi.framework.PackageAdminUtil)'. 
     [exec]     [junit] [2011-05-05 07:03:33.135] service-monitor-thread-1 <KE0100W> Reference '&packageAdminUtil' in bundle 'org.eclipse.virgo.kernel.model' version '3.0.0.BUILD-20110505055327' is waiting for service with filter '(objectClass=org.eclipse.virgo.kernel.osgi.framework.PackageAdminUtil)'. 
     [exec]     [junit] [2011-05-05 07:03:33.174] service-monitor-thread-1 <KE0100W> Reference '&applicationDeployer' in bundle 'org.eclipse.virgo.kernel.model' version '3.0.0.BUILD-20110505055327' is waiting for service with filter '(objectClass=org.eclipse.virgo.kernel.deployer.core.ApplicationDeployer)'. 
     [exec]     [junit] [2011-05-05 07:03:33.175] service-monitor-thread-1 <KE0100W> Reference '&quasiFrameworkFactory' in bundle 'org.eclipse.virgo.kernel.model' version '3.0.0.BUILD-20110505055327' is waiting for service with filter '(objectClass=org.eclipse.virgo.kernel.osgi.quasi.QuasiFrameworkFactory)'. 
     [exec]     [junit] [2011-05-05 07:03:33.175] service-monitor-thread-1 <KE0100W> Reference '&userBundleContext' in bundle 'org.eclipse.virgo.kernel.deployer' version '3.0.0.BUILD-20110505055327' is waiting for service with filter '(&(objectClass=org.osgi.framework.BundleContext)(org.eclipse.virgo.kernel.regionContext=true))'. 
     [exec]     [junit] [2011-05-05 07:03:33.176] service-monitor-thread-1 <KE0100W> Reference '&runtimeArtifactModel' in bundle 'org.eclipse.virgo.kernel.model' version '3.0.0.BUILD-20110505055327' is waiting for service with filter '(objectClass=org.eclipse.virgo.kernel.deployer.model.RuntimeArtifactModel)'. 
     [exec]     [junit] [2011-05-05 07:03:33.176] service-monitor-thread-1 <KE0100W> Reference '&osgiFramework' in bundle 'org.eclipse.virgo.kernel.deployer' version '3.0.0.BUILD-20110505055327' is waiting for service with filter '(objectClass=org.eclipse.virgo.kernel.osgi.framework.OsgiFramework)'. 
     [exec]     [junit] [2011-05-05 07:03:33.176] service-monitor-thread-1 <KE0100W> Reference '&quasiFrameworkFactory' in bundle 'org.eclipse.virgo.kernel.deployer' version '3.0.0.BUILD-20110505055327' is waiting for service with filter '(objectClass=org.eclipse.virgo.kernel.osgi.quasi.QuasiFrameworkFactory)'. 
     [exec]     [junit] [2011-05-05 07:03:33.177] service-monitor-thread-1 <KE0100W> Reference '&importExpander' in bundle 'org.eclipse.virgo.kernel.deployer' version '3.0.0.BUILD-20110505055327' is waiting for service with filter '(objectClass=org.eclipse.virgo.kernel.osgi.framework.ImportExpander)'. 
     [exec]     [junit] [2011-05-05 07:03:33.177] service-monitor-thread-1 <KE0100W> Reference '&region' in bundle 'org.eclipse.virgo.kernel.deployer' version '3.0.0.BUILD-20110505055327' is waiting for service with filter '(&(objectClass=org.eclipse.virgo.kernel.osgi.region.Region)(org.eclipse.virgo.kernel.region.name=org.eclipse.virgo.region.user))'. 
     [exec]     [junit] [2011-05-05 07:03:39.336] main                     <ME0003I> Dump 'target/serviceability/dump/2011-05-05-07-03-944' generated 
     [exec]     [junit] ------------- ---------------- ---------------
     [exec]     [junit] Testcase: org.eclipse.virgo.kernel.deployer.test.OnStartedFailureTests:	Caused an ERROR
     [exec]     [junit] Bundle org.eclipse.virgo.kernel.userregionfactory 3.0.0.BUILD-20110505055327 failed to start.
     [exec]     [junit] org.osgi.framework.BundleException: Bundle org.eclipse.virgo.kernel.userregionfactory 3.0.0.BUILD-20110505055327 failed to start.
     [exec]     [junit] 	at org.eclipse.virgo.osgi.launcher.FrameworkBuilder.installAndStartBundles(FrameworkBuilder.java:189)
     [exec]     [junit] 	at org.eclipse.virgo.osgi.launcher.FrameworkBuilder.start(FrameworkBuilder.java:159)
     [exec]     [junit] 	at org.eclipse.virgo.test.framework.OsgiTestRunner.launchOsgi(OsgiTestRunner.java:169)
     [exec]     [junit] 	at org.eclipse.virgo.test.framework.OsgiTestRunner.run(OsgiTestRunner.java:84)
     [exec]     [junit] Caused by: org.osgi.framework.BundleException: Exception in org.eclipse.virgo.kernel.userregionfactory.Activator.start() of bundle org.eclipse.virgo.kernel.userregionfactory.
     [exec]     [junit] 	at org.eclipse.osgi.framework.internal.core.BundleContextImpl.startActivator(BundleContextImpl.java:734)
     [exec]     [junit] 	at org.eclipse.osgi.framework.internal.core.BundleContextImpl.start(BundleContextImpl.java:683)
     [exec]     [junit] 	at org.eclipse.osgi.framework.internal.core.BundleHost.startWorker(BundleHost.java:381)
     [exec]     [junit] 	at org.eclipse.osgi.framework.internal.core.AbstractBundle.start(AbstractBundle.java:299)
     [exec]     [junit] 	at org.eclipse.osgi.framework.internal.core.AbstractBundle.start(AbstractBundle.java:291)
     [exec]     [junit] 	at org.eclipse.virgo.osgi.launcher.FrameworkBuilder.installAndStartBundles(FrameworkBuilder.java:187)
     [exec]     [junit] Caused by: org.osgi.framework.BundleException: Failed to start bundle org.springframework.osgi.extender 1.2.1
     [exec]     [junit] 	at org.eclipse.virgo.kernel.userregionfactory.Activator.initialiseUserRegionBundles(Activator.java:352)
     [exec]     [junit] 	at org.eclipse.virgo.kernel.userregionfactory.Activator.createUserRegion(Activator.java:176)
     [exec]     [junit] 	at org.eclipse.virgo.kernel.userregionfactory.Activator.start(Activator.java:131)
     [exec]     [junit] 	at org.eclipse.osgi.framework.internal.core.BundleContextImpl$1.run(BundleContextImpl.java:711)
     [exec]     [junit] 	at java.security.AccessController.doPrivileged(Native Method)
     [exec]     [junit] 	at org.eclipse.osgi.framework.internal.core.BundleContextImpl.startActivator(BundleContextImpl.java:702)
     [exec]     [junit] Caused by: org.osgi.framework.BundleException: The bundle "org.springframework.osgi.extender_1.2.1 [46]" could not be resolved. Reason: Missing Constraint: Import-Package: org.springframework.osgi; version="[1.2.1,1.2.1]"
     [exec]     [junit] 	at org.eclipse.osgi.framework.internal.core.AbstractBundle.getResolverError(AbstractBundle.java:1327)
     [exec]     [junit] 	at org.eclipse.osgi.framework.internal.core.AbstractBundle.getResolutionFailureException(AbstractBundle.java:1311)
     [exec]     [junit] 	at org.eclipse.osgi.framework.internal.core.BundleHost.startWorker(BundleHost.java:323)
     [exec]     [junit] 	at org.eclipse.osgi.framework.internal.core.AbstractBundle.start(AbstractBundle.java:299)
     [exec]     [junit] 	at org.eclipse.osgi.framework.internal.core.AbstractBundle.start(AbstractBundle.java:291)
     [exec]     [junit] 	at org.eclipse.virgo.kernel.userregionfactory.Activator.initialiseUserRegionBundles(Activator.java:343)
Comment 7 Glyn Normington CLA 2011-05-05 10:55:20 EDT
The user region bundle has an activator which does not wait for services -- it assumes they are already there as they are fairly primitive kernel services which pre-exist the user region -- before publishing the resolution state dump contributor service.

The next step is to force a resolution error, like the ones we are seeing, in a kernel integration test and then debug why the resolution state dump contributor service is not being driven.
Comment 8 Glyn Normington CLA 2011-05-12 06:11:02 EDT
The dump contributor is held up ultimately by the bundle which fails to resolve, so I have moved the resolution state and region digraph dump contributors to the kernel core bundle.

I zapped the test setup so that a resolution failure occurred similar to the intermittent one and then verified that a state dump was taken and this could be viewed in the admin console OSGi state explorer.

I noticed that the diagnostics displayed in the admin console OSGi state explorer were a little lacking, so raised bug 345563 to track.
Comment 9 Glyn Normington CLA 2011-05-16 11:39:46 EDT
Now running lots of kernel builds in an attempt to reproduce the problem.
Comment 10 Glyn Normington CLA 2011-05-16 12:38:52 EDT
Kernel built successfully 10 times, so closing as the problem seems to have gone into hiding.