| Summary: | Intermittent resolver errors in kernel integration tests | ||
|---|---|---|---|
| Product: | [RT] Virgo | Reporter: | Glyn Normington <glyn.normington> |
| Component: | unknown | Assignee: | Glyn Normington <glyn.normington> |
| Status: | CLOSED WORKSFORME | QA Contact: | |
| Severity: | normal | ||
| Priority: | P3 | CC: | tjwatson |
| Version: | 3.0.0.M03 | ||
| Target Milestone: | --- | ||
| Hardware: | Macintosh | ||
| OS: | Mac OS X - Carbon (unsup.) | ||
| Whiteboard: | |||
Borislav appeared to encounter this when building 3.0.0.M04. Hit the problem twice in succession. Here are the diagnostics for the second failure.
[exec] [junit] Testsuite: org.eclipse.virgo.kernel.deployer.test.TransitiveConstraintFailureDiagnosisTests
[exec] [junit] Tests run: 0, Failures: 0, Errors: 1, Time elapsed: 4.65 sec
[exec] [junit]
[exec] [junit] ------------- Standard Output ---------------
[exec] [junit] [2011-05-03 17:01:03.459] startup-tracker <KE0001I> Kernel starting.
[exec] [junit] ------------- ---------------- ---------------
[exec] [junit] Testcase: org.eclipse.virgo.kernel.deployer.test.TransitiveConstraintFailureDiagnosisTests: Caused an ERROR
[exec] [junit] Bundle org.eclipse.virgo.kernel.userregionfactory 3.0.0.BUILD-20110503154904 failed to start.
[exec] [junit] org.osgi.framework.BundleException: Bundle org.eclipse.virgo.kernel.userregionfactory 3.0.0.BUILD-20110503154904 failed to start.
[exec] [junit] at org.eclipse.virgo.osgi.launcher.FrameworkBuilder.installAndStartBundles(FrameworkBuilder.java:189)
[exec] [junit] at org.eclipse.virgo.osgi.launcher.FrameworkBuilder.start(FrameworkBuilder.java:159)
[exec] [junit] at org.eclipse.virgo.test.framework.OsgiTestRunner.launchOsgi(OsgiTestRunner.java:145)
[exec] [junit] at org.eclipse.virgo.test.framework.OsgiTestRunner.run(OsgiTestRunner.java:79)
[exec] [junit] Caused by: org.osgi.framework.BundleException: Exception in org.eclipse.virgo.kernel.userregionfactory.Activator.start() of bundle org.eclipse.virgo.kernel.userregionfactory.
[exec] [junit] at org.eclipse.osgi.framework.internal.core.BundleContextImpl.startActivator(BundleContextImpl.java:734)
[exec] [junit] at org.eclipse.osgi.framework.internal.core.BundleContextImpl.start(BundleContextImpl.java:683)
[exec] [junit] at org.eclipse.osgi.framework.internal.core.BundleHost.startWorker(BundleHost.java:381)
[exec] [junit] at org.eclipse.osgi.framework.internal.core.AbstractBundle.start(AbstractBundle.java:299)
[exec] [junit] at org.eclipse.osgi.framework.internal.core.AbstractBundle.start(AbstractBundle.java:291)
[exec] [junit] at org.eclipse.virgo.osgi.launcher.FrameworkBuilder.installAndStartBundles(FrameworkBuilder.java:187)
[exec] [junit] Caused by: org.osgi.framework.BundleException: Failed to start bundle org.springframework.osgi.extender 1.2.1
[exec] [junit] at org.eclipse.virgo.kernel.userregionfactory.Activator.initialiseUserRegionBundles(Activator.java:341)
[exec] [junit] at org.eclipse.virgo.kernel.userregionfactory.Activator.createUserRegion(Activator.java:172)
[exec] [junit] at org.eclipse.virgo.kernel.userregionfactory.Activator.start(Activator.java:127)
[exec] [junit] at org.eclipse.osgi.framework.internal.core.BundleContextImpl$1.run(BundleContextImpl.java:711)
[exec] [junit] at java.security.AccessController.doPrivileged(Native Method)
[exec] [junit] at org.eclipse.osgi.framework.internal.core.BundleContextImpl.startActivator(BundleContextImpl.java:702)
[exec] [junit] Caused by: org.osgi.framework.BundleException: The bundle "org.springframework.osgi.extender_1.2.1 [46]" could not be resolved. Reason: Missing Constraint: Import-Package: org.springframework.osgi; version="[1.2.1,1.2.1]"
[exec] [junit] at org.eclipse.osgi.framework.internal.core.AbstractBundle.getResolverError(AbstractBundle.java:1327)
[exec] [junit] at org.eclipse.osgi.framework.internal.core.AbstractBundle.getResolutionFailureException(AbstractBundle.java:1311)
[exec] [junit] at org.eclipse.osgi.framework.internal.core.BundleHost.startWorker(BundleHost.java:323)
[exec] [junit] at org.eclipse.osgi.framework.internal.core.AbstractBundle.start(AbstractBundle.java:299)
[exec] [junit] at org.eclipse.osgi.framework.internal.core.AbstractBundle.start(AbstractBundle.java:291)
[exec] [junit] at org.eclipse.virgo.kernel.userregionfactory.Activator.initialiseUserRegionBundles(Activator.java:339)
Added logic to take a dump if a user region bundle fails to start in kernel commit 6d0d41e010e1e1151797e7e29204b20ea5f58c4e:
try {
bundle.start();
} catch (BundleException e) {
+ // Take state dump for diagnosis of resolution failures
+ this.dumpGenerator.generateDump("User region bundle failed to start", e);
throw new BundleException("Failed to start bundle " + bundle.getSymbolicName() + " " + bundle.getVersion(), e);
}
Added a wait to give the resolution state dump contributor a chance to be registered in kernel commit 1645c9db2c853cdfe0a06e99eaffe284b55d5439:
+ // Give the resolution state dump contributor a chance to be registered.
+ try {
+ Thread.sleep(10000);
+ } catch (InterruptedException _) {
+ }
// Take state dump for diagnosis of resolution failures
this.dumpGenerator.generateDump("User region bundle failed to start", e);
throw new BundleException("Failed to start bundle " + bundle.getSymbolicName() + " " + bundle.getVersion(), e);
On the latest ripple, the bug did not reproduce. Good news for the ripple, bad news for this bug. Reproduced the failure by building the kernel a couple of times. Unfortunately, the resolution state dump contributor does not seem to have been registered in time, so the resultant dump does not include the OSGi state. Perhaps the user region bundle which registers the dump contributor is waiting for one or more services from bundles in the kernel which are waiting for the user region to be available, etc, as can be seen from the console messages below:
[exec] [junit] Testsuite: org.eclipse.virgo.kernel.deployer.test.OnStartedFailureTests
[exec] [junit] Tests run: 0, Failures: 0, Errors: 1, Time elapsed: 14.358 sec
[exec] [junit]
[exec] [junit] ------------- Standard Output ---------------
[exec] [junit] [2011-05-05 07:03:27.398] startup-tracker <KE0001I> Kernel starting.
[exec] [junit] [2011-05-05 07:03:33.133] service-monitor-thread-1 <KE0100W> Reference '&packageAdminUtil' in bundle 'org.eclipse.virgo.kernel.deployer' version '3.0.0.BUILD-20110505055327' is waiting for service with filter '(objectClass=org.eclipse.virgo.kernel.osgi.framework.PackageAdminUtil)'.
[exec] [junit] [2011-05-05 07:03:33.135] service-monitor-thread-1 <KE0100W> Reference '&packageAdminUtil' in bundle 'org.eclipse.virgo.kernel.model' version '3.0.0.BUILD-20110505055327' is waiting for service with filter '(objectClass=org.eclipse.virgo.kernel.osgi.framework.PackageAdminUtil)'.
[exec] [junit] [2011-05-05 07:03:33.174] service-monitor-thread-1 <KE0100W> Reference '&applicationDeployer' in bundle 'org.eclipse.virgo.kernel.model' version '3.0.0.BUILD-20110505055327' is waiting for service with filter '(objectClass=org.eclipse.virgo.kernel.deployer.core.ApplicationDeployer)'.
[exec] [junit] [2011-05-05 07:03:33.175] service-monitor-thread-1 <KE0100W> Reference '&quasiFrameworkFactory' in bundle 'org.eclipse.virgo.kernel.model' version '3.0.0.BUILD-20110505055327' is waiting for service with filter '(objectClass=org.eclipse.virgo.kernel.osgi.quasi.QuasiFrameworkFactory)'.
[exec] [junit] [2011-05-05 07:03:33.175] service-monitor-thread-1 <KE0100W> Reference '&userBundleContext' in bundle 'org.eclipse.virgo.kernel.deployer' version '3.0.0.BUILD-20110505055327' is waiting for service with filter '(&(objectClass=org.osgi.framework.BundleContext)(org.eclipse.virgo.kernel.regionContext=true))'.
[exec] [junit] [2011-05-05 07:03:33.176] service-monitor-thread-1 <KE0100W> Reference '&runtimeArtifactModel' in bundle 'org.eclipse.virgo.kernel.model' version '3.0.0.BUILD-20110505055327' is waiting for service with filter '(objectClass=org.eclipse.virgo.kernel.deployer.model.RuntimeArtifactModel)'.
[exec] [junit] [2011-05-05 07:03:33.176] service-monitor-thread-1 <KE0100W> Reference '&osgiFramework' in bundle 'org.eclipse.virgo.kernel.deployer' version '3.0.0.BUILD-20110505055327' is waiting for service with filter '(objectClass=org.eclipse.virgo.kernel.osgi.framework.OsgiFramework)'.
[exec] [junit] [2011-05-05 07:03:33.176] service-monitor-thread-1 <KE0100W> Reference '&quasiFrameworkFactory' in bundle 'org.eclipse.virgo.kernel.deployer' version '3.0.0.BUILD-20110505055327' is waiting for service with filter '(objectClass=org.eclipse.virgo.kernel.osgi.quasi.QuasiFrameworkFactory)'.
[exec] [junit] [2011-05-05 07:03:33.177] service-monitor-thread-1 <KE0100W> Reference '&importExpander' in bundle 'org.eclipse.virgo.kernel.deployer' version '3.0.0.BUILD-20110505055327' is waiting for service with filter '(objectClass=org.eclipse.virgo.kernel.osgi.framework.ImportExpander)'.
[exec] [junit] [2011-05-05 07:03:33.177] service-monitor-thread-1 <KE0100W> Reference '®ion' in bundle 'org.eclipse.virgo.kernel.deployer' version '3.0.0.BUILD-20110505055327' is waiting for service with filter '(&(objectClass=org.eclipse.virgo.kernel.osgi.region.Region)(org.eclipse.virgo.kernel.region.name=org.eclipse.virgo.region.user))'.
[exec] [junit] [2011-05-05 07:03:39.336] main <ME0003I> Dump 'target/serviceability/dump/2011-05-05-07-03-944' generated
[exec] [junit] ------------- ---------------- ---------------
[exec] [junit] Testcase: org.eclipse.virgo.kernel.deployer.test.OnStartedFailureTests: Caused an ERROR
[exec] [junit] Bundle org.eclipse.virgo.kernel.userregionfactory 3.0.0.BUILD-20110505055327 failed to start.
[exec] [junit] org.osgi.framework.BundleException: Bundle org.eclipse.virgo.kernel.userregionfactory 3.0.0.BUILD-20110505055327 failed to start.
[exec] [junit] at org.eclipse.virgo.osgi.launcher.FrameworkBuilder.installAndStartBundles(FrameworkBuilder.java:189)
[exec] [junit] at org.eclipse.virgo.osgi.launcher.FrameworkBuilder.start(FrameworkBuilder.java:159)
[exec] [junit] at org.eclipse.virgo.test.framework.OsgiTestRunner.launchOsgi(OsgiTestRunner.java:169)
[exec] [junit] at org.eclipse.virgo.test.framework.OsgiTestRunner.run(OsgiTestRunner.java:84)
[exec] [junit] Caused by: org.osgi.framework.BundleException: Exception in org.eclipse.virgo.kernel.userregionfactory.Activator.start() of bundle org.eclipse.virgo.kernel.userregionfactory.
[exec] [junit] at org.eclipse.osgi.framework.internal.core.BundleContextImpl.startActivator(BundleContextImpl.java:734)
[exec] [junit] at org.eclipse.osgi.framework.internal.core.BundleContextImpl.start(BundleContextImpl.java:683)
[exec] [junit] at org.eclipse.osgi.framework.internal.core.BundleHost.startWorker(BundleHost.java:381)
[exec] [junit] at org.eclipse.osgi.framework.internal.core.AbstractBundle.start(AbstractBundle.java:299)
[exec] [junit] at org.eclipse.osgi.framework.internal.core.AbstractBundle.start(AbstractBundle.java:291)
[exec] [junit] at org.eclipse.virgo.osgi.launcher.FrameworkBuilder.installAndStartBundles(FrameworkBuilder.java:187)
[exec] [junit] Caused by: org.osgi.framework.BundleException: Failed to start bundle org.springframework.osgi.extender 1.2.1
[exec] [junit] at org.eclipse.virgo.kernel.userregionfactory.Activator.initialiseUserRegionBundles(Activator.java:352)
[exec] [junit] at org.eclipse.virgo.kernel.userregionfactory.Activator.createUserRegion(Activator.java:176)
[exec] [junit] at org.eclipse.virgo.kernel.userregionfactory.Activator.start(Activator.java:131)
[exec] [junit] at org.eclipse.osgi.framework.internal.core.BundleContextImpl$1.run(BundleContextImpl.java:711)
[exec] [junit] at java.security.AccessController.doPrivileged(Native Method)
[exec] [junit] at org.eclipse.osgi.framework.internal.core.BundleContextImpl.startActivator(BundleContextImpl.java:702)
[exec] [junit] Caused by: org.osgi.framework.BundleException: The bundle "org.springframework.osgi.extender_1.2.1 [46]" could not be resolved. Reason: Missing Constraint: Import-Package: org.springframework.osgi; version="[1.2.1,1.2.1]"
[exec] [junit] at org.eclipse.osgi.framework.internal.core.AbstractBundle.getResolverError(AbstractBundle.java:1327)
[exec] [junit] at org.eclipse.osgi.framework.internal.core.AbstractBundle.getResolutionFailureException(AbstractBundle.java:1311)
[exec] [junit] at org.eclipse.osgi.framework.internal.core.BundleHost.startWorker(BundleHost.java:323)
[exec] [junit] at org.eclipse.osgi.framework.internal.core.AbstractBundle.start(AbstractBundle.java:299)
[exec] [junit] at org.eclipse.osgi.framework.internal.core.AbstractBundle.start(AbstractBundle.java:291)
[exec] [junit] at org.eclipse.virgo.kernel.userregionfactory.Activator.initialiseUserRegionBundles(Activator.java:343)
The user region bundle has an activator which does not wait for services -- it assumes they are already there as they are fairly primitive kernel services which pre-exist the user region -- before publishing the resolution state dump contributor service. The next step is to force a resolution error, like the ones we are seeing, in a kernel integration test and then debug why the resolution state dump contributor service is not being driven. The dump contributor is held up ultimately by the bundle which fails to resolve, so I have moved the resolution state and region digraph dump contributors to the kernel core bundle. I zapped the test setup so that a resolution failure occurred similar to the intermittent one and then verified that a state dump was taken and this could be viewed in the admin console OSGi state explorer. I noticed that the diagnostics displayed in the admin console OSGi state explorer were a little lacking, so raised bug 345563 to track. Now running lots of kernel builds in an attempt to reproduce the problem. Kernel built successfully 10 times, so closing as the problem seems to have gone into hiding. |
I've noticed this during the past few days although on one occasion it was when building SpringSource dm Server 2.0.5 so it is more likely to be environmental possibly combined with a long-standing bug rather than a new code bug. I am recording it so I, and others if they encounter the problem, can build up a pattern. Typically a deployer integration test fails with a resolution error which does not reproduce when the tests are re-run. Some basic bundle in the user region fails to deploy, so unfortunately no state dump is taken. For example: [exec] [junit] ------------- Standard Output --------------- [exec] [junit] [2011-04-13 15:06:53.474] startup-tracker <KE0001I> Kernel starting. [exec] [junit] ------------- ---------------- --------------- [exec] [junit] Testcase: org.eclipse.virgo.kernel.deployer.test.ServiceScopingTests: Caused an ERROR [exec] [junit] Bundle org.eclipse.virgo.kernel.userregionfactory 3.0.0.BUILD-20110413135357 failed to start. [exec] [junit] org.osgi.framework.BundleException: Bundle org.eclipse.virgo.kernel.userregionfactory 3.0.0.BUILD-20110413135357 failed to start. [exec] [junit] at org.eclipse.virgo.osgi.launcher.FrameworkBuilder.installAndStartBundles(FrameworkBuilder.java:189) [exec] [junit] at org.eclipse.virgo.osgi.launcher.FrameworkBuilder.start(FrameworkBuilder.java:159) [exec] [junit] at org.eclipse.virgo.test.framework.OsgiTestRunner.launchOsgi(OsgiTestRunner.java:145) [exec] [junit] at org.eclipse.virgo.test.framework.OsgiTestRunner.run(OsgiTestRunner.java:79) [exec] [junit] Caused by: org.osgi.framework.BundleException: Exception in org.eclipse.virgo.kernel.userregionfactory.Activator.start() of bundle org.eclipse.virgo.kernel.userregionfactory. [exec] [junit] at org.eclipse.osgi.framework.internal.core.BundleContextImpl.startActivator(BundleContextImpl.java:734) [exec] [junit] at org.eclipse.osgi.framework.internal.core.BundleContextImpl.start(BundleContextImpl.java:683) [exec] [junit] at org.eclipse.osgi.framework.internal.core.BundleHost.startWorker(BundleHost.java:381) [exec] [junit] at org.eclipse.osgi.framework.internal.core.AbstractBundle.start(AbstractBundle.java:299) [exec] [junit] at org.eclipse.osgi.framework.internal.core.AbstractBundle.start(AbstractBundle.java:291) [exec] [junit] at org.eclipse.virgo.osgi.launcher.FrameworkBuilder.installAndStartBundles(FrameworkBuilder.java:187) [exec] [junit] Caused by: org.osgi.framework.BundleException: Failed to start bundle org.springframework.osgi.extender 1.2.1 [exec] [junit] at org.eclipse.virgo.kernel.userregionfactory.Activator.initialiseUserRegionBundles(Activator.java:341) [exec] [junit] at org.eclipse.virgo.kernel.userregionfactory.Activator.createUserRegion(Activator.java:172) [exec] [junit] at org.eclipse.virgo.kernel.userregionfactory.Activator.start(Activator.java:127) [exec] [junit] at org.eclipse.osgi.framework.internal.core.BundleContextImpl$1.run(BundleContextImpl.java:711) [exec] [junit] at java.security.AccessController.doPrivileged(Native Method) [exec] [junit] at org.eclipse.osgi.framework.internal.core.BundleContextImpl.startActivator(BundleContextImpl.java:702) [exec] [junit] Caused by: org.osgi.framework.BundleException: The bundle "org.springframework.osgi.extender_1.2.1 [46]" could not be resolved. Reason: Missing Constraint: Import-Package: org.springframework.osgi; version="[1.2.1,1.2.1]" [exec] [junit] at org.eclipse.osgi.framework.internal.core.AbstractBundle.getResolverError(AbstractBundle.java:1327) [exec] [junit] at org.eclipse.osgi.framework.internal.core.AbstractBundle.getResolutionFailureException(AbstractBundle.java:1311) [exec] [junit] at org.eclipse.osgi.framework.internal.core.BundleHost.startWorker(BundleHost.java:323) [exec] [junit] at org.eclipse.osgi.framework.internal.core.AbstractBundle.start(AbstractBundle.java:299) [exec] [junit] at org.eclipse.osgi.framework.internal.core.AbstractBundle.start(AbstractBundle.java:291) [exec] [junit] at org.eclipse.virgo.kernel.userregionfactory.Activator.initialiseUserRegionBundles(Activator.java:339) [exec] [junit] [exec] [junit] [exec] [junit] [2011-04-13 15:06:55.768] System Bundle Shutdown <KE0010I> Shutdown initiated. [exec] [junit] [2011-04-13 15:06:55.774] startup-tracker <KE0006E> Kernel failed to start due to the aborted start of 'org.eclipse.virgo.kernel.deployer' version '3.0.0.BUILD-20110413135357'. I am running on Mac OS X 10.6.7 with 12 GB of RAM and a dual-core Intel Xeon 2.66 GHz processor and with: $ java -version java version "1.6.0_24" Java(TM) SE Runtime Environment (build 1.6.0_24-b07-334-10M3326) Java HotSpot(TM) 64-Bit Server VM (build 19.1-b02-334, mixed mode) and with: $ ant -version Apache Ant version 1.7.1 compiled on June 27 2008