| Summary: | Improve Virgo startup/deployment performance | ||
|---|---|---|---|
| Product: | [RT] Virgo | Reporter: | Peter Gardfjäll <peter.gardfjall.work> |
| Component: | unknown | Assignee: | Borislav Kapukaranov <b.kapukaranov> |
| Status: | ASSIGNED --- | QA Contact: | |
| Severity: | enhancement | ||
| Priority: | P3 | CC: | b.kapukaranov, eclipse, glyn.normington, mario.scalas |
| Version: | unspecified | ||
| Target Milestone: | --- | ||
| Hardware: | PC | ||
| OS: | Windows 7 | ||
| Whiteboard: | |||
| Bug Depends on: | 364571 | ||
| Bug Blocks: | |||
|
Description
Peter Gardfjäll
I did some further profiling on the time spent in different stages of the deployment pipeline for the plan file refered to in the forum thread http://www.eclipse.org/forums/index.php?t=msg&th=167723&start=0&SQ=de8b3b6f86c0ebc1d2d867af27c2eb0f: <plan name="platform.plan" version="1.0" scoped="false" atomic="true" xmlns="http://www.springsource.org/schema/dm-server/plan" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation=" http://www.springsource.org/schema/dm-server/plan http://www.springsource.org/schema/dm-server/plan/springsource-dm-server-plan.xsd" > <artifact type="bundle" name="com.springsource.org.apache.derby" version="10.5.1000001.764942" /> <artifact type="bundle" name="org.eclipse.persistence.core" version="1.1.3.v20091002-r5404" /> <artifact type="bundle" name="org.eclipse.persistence.antlr" version="1.1.3.v20091002-r5404" /> <artifact type="bundle" name="org.eclipse.persistence.asm" version="1.1.3.v20091002-r5404" /> <artifact type="bundle" name="org.eclipse.persistence.jpa" version="1.1.3.v20091002-r5404" /> </plan> On the test setup I was using the plan took a total of 67 seconds to deploy. Of this time 23 seconds was spent in the PlanResolver transformer and 44 seconds were spent in the CommitStage. In the CommitStage, almost all time was spent in the Bundle bundle = this.bundleContext.installBundle(locationUri.toString()); statement in the StandardQuasiFramework.installQuasiBundles() method. The following times were recorded for the BundleContext.installBundle() statement for each of the plan bundles: com.springsource.org.apache.derby-10.5.1000001.764942.jar: 16 seconds org.eclipse.persistence.core-1.1.3.jar: 24 seconds org.eclipse.persistence.antlr-1.1.3.jar: 1.5 seconds org.eclipse.persistence.asm-1.1.3.jar: 1 second org.eclipse.persistence.jpa-1.1.3.jar: 2 seconds I hope that this will help identitfy the performance bottlenecks. best regards, Peter That's very helpful Peter and quite surprising. So this doesn't appear to be a resolver performance issue which was my first fear. It's fascinating that the bundle install operation takes so long. Essentially, we install the plan contents into a "side state" during plan resolution (which means expanding the tree rooted in the plan). During the commit stage, we then install the bundles in the OSGi framework. Are there any reasons why file IO should be particularly slow on your system. E.g. encrypted disk, over-zealous virus checker, remote file system, etc.? If you have the time, you could try installing the persistence core bundle into raw Equinox (launched from the command line) for comparison. You see it *appears* that the bottleneck is inside Equinox unless some of Virgo's Equinox hooks are having a nasty side-effect, which I somewhat doubt. However, it should be pretty easy to pin this down using a profiler. You didn't happen to gather profiling information for BundleContext.installBundle did you? I tried to reproduce the problem. I found that it was necessary to start the server first and then copy the Derby the Eclipse persistence JARs into repository/usr to prevent them being installed automatically as optional dependencies (presumably of Spring framework). Anyway, I did that and the copied the plan into pickup and it deployed slowly, but not *that* slowly: [2010-11-18 08:47:08.666] fs-watcher <HD0001I> Hot deployer processing 'CREATED' event for file 'slow.plan'. [2010-11-18 08:47:08.692] fs-watcher <DE0000I> Installing plan 'platform.plan' version '1.0.0'. [2010-11-18 08:47:10.242] fs-watcher <DE0000I> Installing bundle 'com.springsource.org.apache.derby' version '10.5.1000001.764942'. [2010-11-18 08:47:11.708] fs-watcher <DE0000I> Installing bundle 'org.eclipse.persistence.core' version '1.1.3.v20091002-r5404'. [2010-11-18 08:47:11.797] fs-watcher <DE0000I> Installing bundle 'org.eclipse.persistence.antlr' version '1.1.3.v20091002-r5404'. [2010-11-18 08:47:11.910] fs-watcher <DE0000I> Installing bundle 'org.eclipse.persistence.asm' version '1.1.3.v20091002-r5404'. [2010-11-18 08:47:12.288] fs-watcher <DE0000I> Installing bundle 'org.eclipse.persistence.jpa' version '1.1.3.v20091002-r5404'. [2010-11-18 08:47:16.313] fs-watcher <DE0001I> Installed bundle 'com.springsource.org.apache.derby' version '10.5.1000001.764942'. [2010-11-18 08:47:16.316] fs-watcher <DE0001I> Installed bundle 'org.eclipse.persistence.core' version '1.1.3.v20091002-r5404'. [2010-11-18 08:47:16.319] fs-watcher <DE0001I> Installed bundle 'org.eclipse.persistence.antlr' version '1.1.3.v20091002-r5404'. [2010-11-18 08:47:16.322] fs-watcher <DE0001I> Installed bundle 'org.eclipse.persistence.asm' version '1.1.3.v20091002-r5404'. [2010-11-18 08:47:16.325] fs-watcher <DE0001I> Installed bundle 'org.eclipse.persistence.jpa' version '1.1.3.v20091002-r5404'. [2010-11-18 08:47:16.328] fs-watcher <DE0001I> Installed plan 'platform.plan' version '1.0.0'. [2010-11-18 08:47:16.340] fs-watcher <DE0004I> Starting plan 'platform.plan' version '1.0.0'. [2010-11-18 08:47:16.344] fs-watcher <DE0004I> Starting bundle 'com.springsource.org.apache.derby' version '10.5.1000001.764942'. [2010-11-18 08:47:16.475] start-signalling-2 <DE0005I> Started bundle 'com.springsource.org.apache.derby' version '10.5.1000001.764942'. [2010-11-18 08:47:16.476] fs-watcher <DE0004I> Starting bundle 'org.eclipse.persistence.core' version '1.1.3.v20091002-r5404'. [2010-11-18 08:47:16.482] start-signalling-2 <DE0005I> Started bundle 'org.eclipse.persistence.core' version '1.1.3.v20091002-r5404'. [2010-11-18 08:47:16.482] fs-watcher <DE0004I> Starting bundle 'org.eclipse.persistence.antlr' version '1.1.3.v20091002-r5404'. [2010-11-18 08:47:16.488] fs-watcher <DE0004I> Starting bundle 'org.eclipse.persistence.asm' version '1.1.3.v20091002-r5404'. [2010-11-18 08:47:16.488] start-signalling-5 <DE0005I> Started bundle 'org.eclipse.persistence.antlr' version '1.1.3.v20091002-r5404'. [2010-11-18 08:47:16.494] start-signalling-2 <DE0005I> Started bundle 'org.eclipse.persistence.asm' version '1.1.3.v20091002-r5404'. [2010-11-18 08:47:16.494] fs-watcher <DE0004I> Starting bundle 'org.eclipse.persistence.jpa' version '1.1.3.v20091002-r5404'. [2010-11-18 08:47:16.535] start-signalling-2 <DE0005I> Started bundle 'org.eclipse.persistence.jpa' version '1.1.3.v20091002-r5404'. [2010-11-18 08:47:16.539] start-signalling-2 <DE0005I> Started plan 'platform.plan' version '1.0.0'. So the plan too 8 seconds in total to deploy. I haven't tried reproducing this on Windows as I don't have a Windows VM handy (I'm away from the office), so it *could* be a Windows oddity or something more specific to your environment. Apologies for the typos in comment 3: "... then copy the Derby [and] the Eclipse persistence JARs..." and "... the plan too[k] 8 seconds in total to deploy". (In reply to comment #2) I don't see any reason why file I/O would be a bottleneck on my system. I tried to install the aforementioned bundles in a raw equinox (3.6.1) setting and all bundles start up almost instantaneously (2-3 seconds at most from starting equinox until all bundles are in ACTIVE state). I ran java -jar org.eclipse.osgi_3.6.1.R36x_v20100806.jar -console -clean with the following config.ini eclipse.ignoreApp=true osgi.clean=true osgi.startLevel=6 osgi.bundles.defaultStartLevel=5 osgi.bundles=\ bundles/kernel/org.eclipse.osgi.util_3.2.0.v20090520-1800.jar@1:start,\ bundles/kernel/org.eclipse.osgi.services_3.2.0.v20090520-1800.jar@1:start,\ bundles/extras/com.springsource.org.apache.derby-10.5.1000001.764942.jar@start,\ bundles/extras/javax.persistence-1.1.3@start,\ bundles/extras/org.eclipse.persistence.antlr-1.1.3@start,\ bundles/extras/org.eclipse.persistence.asm-1.1.3@start,\ bundles/extras/org.eclipse.persistence.core-1.1.3.jar@start I also tried to install only org.eclipse.persistence.core-1.1.3.jar (after installing the prerequisites) and it installs/starts immediately. I'm afraid I haven't been using a profiler. I'm not sure if this is relevant, but I haven't been deploying the plan on the virgo web server but on a virgo kernel distribution. (In reply to comment #3) Yes, something strange must be at play here. I tried simply dropping in the derby bundle in a fresh virgo kernel distribution and as you can see from the log traces below, the kernel and user region starts up fairly quick (8.5 seconds) but the (drop-in) deployment of derby in the pickup directory takes roughly 30 seconds. [2010-11-18 18:38:25.793] startup-tracker <KE0001I> Kernel starting . ... [2010-11-18 18:38:34.291] Thread-2 <UR0001I> User region rea dy. [2010-11-18 18:41:18.334] fs-watcher <HD0001I> Hot deployer pr ocessing 'CREATED' event for file 'com.springsource.org.apache.derby-10.5.100000 1.764942.jar'. [2010-11-18 18:41:25.279] fs-watcher <DE0000I> Installing bund le 'com.springsource.org.apache.derby' version '10.5.1000001.764942'. [2010-11-18 18:41:57.203] fs-watcher <DE0001I> Installed bundl e 'com.springsource.org.apache.derby' version '10.5.1000001.764942'. [2010-11-18 18:41:57.219] fs-watcher <DE0004I> Starting bundle 'com.springsource.org.apache.derby' version '10.5.1000001.764942'. [2010-11-18 18:41:58.287] start-signalling-1 <DE0005I> Started bundle 'com.springsource.org.apache.derby' version '10.5.1000001.764942'. The slowness reproduces on a Windows VM. It took 90 seconds to deplon on the kernel. I'll look into purchasing a profiler for Windows to delve into this. Strangely enough as I repeated the bringing up of the kernel, putting the JARs in repository\usr, and deploying the slow plan to pickup the deployment process got faster. I can't explain this as the JVM has been recycled. Anyway, it still takes around 20 seconds so hopefully the profile results will still be meaningful. The whole process consumed about 9 seconds of CPU, so it seems likely that the remaining 11 seconds was mainly I/O. PlanResolver.transform and its subtree of calls accounted for 4.7 seconds of CPU and of this 4.6 seconds were spent in ZipUtils.unzipTo during staging the files to the deployer's work directory. The commit stage, installing bundles into the OSGi framework, accounted for 4.3 seconds of CPU and most of this seemed to be spent in AdaptorUtil.copyDir deep copying from the deployer's work directory to Equinox's work directory. The bundles which seem to take the most time are derby which is a 2.5 Mb JAR (unzips to 5.5 Mb) and persistence core which is a 3.6 Mb JAR (unzips to 7.8 Mb). I haven't counted the number of files in these JARs, but it must be in the thousands to account for the JAR sizes. So my conclusion so far is that file unzipping and copying is the dominant overhead. It would be a major design change to avoid staging bundles which are deployed through the deployment pipeline. So a pragmatic workaround would be to install these JARs directly into the OSGi framework rather than deploy them via the deployment pipeline. This can be achieved by putting the JARs in their own directory, perhaps lib/usr and then adding the JARs to the baseBundle property in config/org.eclipse.virgo.kernel.userregion.properties. Having enabled warm start should improve a lot the startup time after the first start. This is planned for 3.0.0 and the work on this topic is ongoing currently. As we consider warm start, some old JIRAs from dm Server days may be relevant: https://issuetracker.springsource.com/browse/DMS-449: "Warm start is incompatible with in-memory manifest transformation as currently implemented since Equinox warm start installs bundles from its persistent storage which contains untransformed manifests." Note that https://issuetracker.springsource.com/browse/DMS-417 introduced the cold starting of Equinox. IIRC the problem was that the warm start state was somehow corrupt. https://issuetracker.springsource.com/browse/DMS-395 has a stack trace which is the kind of state corruption we used to see on Equinox warm start after an abrupt termination of the process (e.g. Ctrl-C). The Virgo Nano distribution will have much faster startup time, probably memory optimisations too. Linking with the relevant enhancement. |