| Summary: | Startup takes 20 times longer with P2 enabled |
|---|---|
| Product: | [Eclipse Project] Equinox |
| Reporter: | David Williams <david_williams> |
| Component: | p2 |
| Assignee: | P2 Inbox <equinox.p2-inbox> |
| Status: | RESOLVED FIXED |
| QA Contact: | |
| Severity: | blocker |
| Priority: | P3 |
| CC: | david_williams, deboer, jeffmcaffer, philippe_mulet, simon_kaegi |
| Version: | unspecified |
| Target Milestone: | 3.4 M6 |
| Hardware: | PC |
| OS: | Linux |
| Whiteboard: | |
| Bug Depends on: | |
| Bug Blocks: | 223238 |
| Attachments: | |
Description
David Williams
I take that back about there not being such a difference on Linux x86 ... I hadn't moved up to the most recent P2 SDK on that box. Once I moved up, the slowdown there was significant also. If not obvious, we run our JUnit tests with WTP and all our prerequisites extracted to the 'dropins' folder. Is there anything simple we can do on our end to speed this up? Remove dependency versions from features? I realize that if we installed (once) with P2 operations, we'd not have this problem ... but I am hoping for some quick and easy way to get through this week's JUnit tests (which are needed on Thursday for our weekly declared build).

This is largely due to support we added this week for performing "best-effort" installation from the dropins folder, with resilience to unsatisfied dependencies, multiple versions, etc. We haven't had time to optimize this yet (bug 224270). Note that it is only the first startup after adding content to dropins that is slow, but unfortunately when running tests you typically add new things to dropins for each test run, so this isn't much reassurance.

(In reply to comment #4)
> Note that it is only the first startup after adding content to dropins that is
> slow, but unfortunately when running tests you typically add new things to
> dropins for each test run so this isn't much reassurance.

Yes, actually we don't ... we dump it all and then run through all the tests (we've always intended to break them up and make them more precise, but haven't yet). But we do run each test in a new workspace, if that matters.

Ok, that offers some hope - there may be an optimization we're missing and it's doing too much work on subsequent starts. With the Eclipse platform test framework, each set of tests is dropped into a clean, new Eclipse install to avoid the possibility of old tests contaminating future runs.

(In reply to comment #6)
> Ok, that offers some hope - there may be an optimization we're missing and it's
> doing too much work on subsequent starts. With the Eclipse platform test
> framework each set of tests is dropped into a clean, new Eclipse install to
> avoid the possibility of old tests contaminating future runs.

I just confirmed: the very first test that runs has roughly the same numbers as subsequent runs. So ... either I'm completely mistaken about how our tests run, or the basic resolution info isn't being cached (or is being cached in the workspace, which wouldn't seem right).

First test (releng Test Build):
Time to load bundles: 4703
Starting application: 288143

Pre-P2:
Time to load bundles: 105
Starting application: 215367

So ... do you think that if I follow the instructions at http://wiki.eclipse.org/Equinox_p2_reverting_to_UM (first three steps), the startup time will also revert to previous levels?

Just another thought ... I'm sure you've already thought of it, but just in case your heads are spinning from lack of sleep :) ... I wonder if the increased time could be related to re-reading the manifests a lot, which are now signed and hence take a long time to read? If so, some "manifest cache" might help.

Created attachment 93848 [details]
patch
This patch should reduce the impact on startup time for the second and later runs. In this patch we check the repository timestamps and profile timestamps to see if we have to re-synch the profile in the ProfileSynchronizer. This lets us avoid the plan creation and execution steps for the typical reuse use cases.
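The timestamp check the patch describes can be sketched roughly as follows. This is an illustrative sketch only, not the actual ProfileSynchronizer code from the patch; the class and method names (`CachedSynchronizer`, `needsSync`, `markSynced`) are hypothetical.

```java
// Illustrative sketch (not the real p2 code): skip profile
// synchronization when neither the repository timestamps nor the
// profile timestamp have changed since the last successful sync.
import java.util.HashMap;
import java.util.Map;

public class CachedSynchronizer {
    // State recorded after the last successful synchronization.
    private final Map<String, Long> lastSeen = new HashMap<>();
    private long lastProfileTimestamp = -1L;

    /** Returns true only when some timestamp differs from the cached state. */
    public boolean needsSync(Map<String, Long> repoTimestamps, long profileTimestamp) {
        if (profileTimestamp != lastProfileTimestamp) {
            return true;
        }
        return !repoTimestamps.equals(lastSeen);
    }

    /** Record the state that the completed sync was based on. */
    public void markSynced(Map<String, Long> repoTimestamps, long profileTimestamp) {
        lastSeen.clear();
        lastSeen.putAll(repoTimestamps);
        lastProfileTimestamp = profileTimestamp;
    }

    public static void main(String[] args) {
        CachedSynchronizer s = new CachedSynchronizer();
        Map<String, Long> repos = new HashMap<>();
        repos.put("dropins", 1000L);

        // First start after adding content: must sync.
        System.out.println(s.needsSync(repos, 42L)); // true
        s.markSynced(repos, 42L);

        // Second start, nothing changed: plan creation/execution can be skipped.
        System.out.println(s.needsSync(repos, 42L)); // false

        // New bundle dropped into dropins bumps the repository timestamp.
        repos.put("dropins", 2000L);
        System.out.println(s.needsSync(repos, 42L)); // true
    }
}
```

The comparison is cheap relative to plan creation and execution, which is why second and later startups benefit even though the first reconcile is still slow.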
I'll attach the replacement bundles to go along with the I20080327-0100 build.
Created attachment 93849 [details]
zip of replacement bundles
I tried this fix on my local system, and the startup was substantially faster, roughly 10 times faster.

before fix:
Time to load bundles: 1562
Starting application: 331938
Application Started: 369172

after fix:
Time to load bundles: 125
Starting application: 17797
Application Started: 37203

Which makes me think our JUnit tests will be able to complete in a reasonable time. But I won't know for sure until late tonight or early AM.

I've released the patch to HEAD. We can leave this bug open until we get back the results of your (DW) tests.

Tests will run soon, but I wanted to change the severity, since the bug this one "blocks" was closed. I consider this bug blocking, since obviously if we can't complete our JUnit tests, we couldn't declare a milestone ... and we never did get them to run all the way through with the unfixed code. (Not to mention what a pain it is to wait 360 seconds for my workspace to come up, instead of 30 seconds!) I'm optimistic, but just wanted to do this bookkeeping.

Also, I've noticed, and IM'd briefly with Simon about it: if -clean is specified, the extra very, very long startup time happens again. Obviously, we'd want some way to "clean" (reset) this cached timestamp data, but I wonder if it should be some other parameter, in order to maintain backwards-compatible behavior. As it is, I've had to remove -clean from all our JUnit test scripts (and am hoping that doesn't introduce a lot of errors :( Should this be a new bug? Or can it be considered part of this same issue?

Applying these patches (along with the patch in bug 224289) did allow our JUnit tests to complete, so I consider this bug fixed ... at least, the blocking nature of the slowdown. It is still slower, even after I removed osgi.clean=true (which the comparison runs below did not have; we used to always use osgi.clean=true to minimize influence between tests). To give some very rough data: before P2, our tests took about 73 minutes. With P2, the total time was about 99 minutes. So I'm guessing your startup performance tests, even with this fix, will show about a 35% regression in startup time. But at least that's containable and allows us to run. Thanks very much for this fix!

(In reply to comment #14)
> ... so I consider this bug fixed ...

Just to be explicit ... I mean fixed, as long as it gets into M6 :)

Thanks David! We should be able to find more opportunities for improvement in M7. Our fixes were released for the build last night. I have opened bug 224579 to track further performance enhancements in M7 on first startup after installing new bundles.

(In reply to comment #16)
> Thanks David! We should be able to find more opportunities for improvement in
> M7. Our fixes were released for the build last night.

I assume that's the I20080327-2251 build? We'll try that, since we need a fix for bug 224423 anyway.
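The -clean interaction raised in the thread (a normal start reuses the cached timestamps, while -clean discards them and forces the slow full reconcile) can be modeled with a small sketch. This is a hypothetical illustration, not the real Equinox startup code; `StartupCacheSketch` and its methods are invented names.

```java
// Hedged sketch of the -clean behavior discussed above: -clean wipes
// the cached timestamp state, so the next startup falls back to a full
// reconcile, while a normal restart with unchanged content stays fast.
public class StartupCacheSketch {
    private Long cachedRepoTimestamp; // null means "no cache, full reconcile needed"

    /** Simulate one startup; returns true when a full reconcile runs. */
    public boolean startup(boolean cleanFlag, long currentRepoTimestamp) {
        if (cleanFlag) {
            cachedRepoTimestamp = null; // -clean resets the cached state
        }
        boolean fullReconcile = cachedRepoTimestamp == null
                || cachedRepoTimestamp != currentRepoTimestamp;
        cachedRepoTimestamp = currentRepoTimestamp; // remember state for next start
        return fullReconcile;
    }

    public static void main(String[] args) {
        StartupCacheSketch app = new StartupCacheSketch();
        System.out.println(app.startup(false, 100L)); // true: first start, no cache
        System.out.println(app.startup(false, 100L)); // false: cache hit, fast start
        System.out.println(app.startup(true, 100L));  // true: -clean forces a reconcile
    }
}
```

This illustrates the trade-off in the thread: test scripts that pass -clean on every run never benefit from the timestamp cache, which is why a separate "reset only the p2 cache" parameter was suggested.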