Bug 311929 - Very slow, seems to be repeating operations when trying to install new software
Summary: Very slow, seems to be repeating operations when trying to install new software
Status: RESOLVED FIXED
Alias: None
Product: Equinox
Classification: Eclipse Project
Component: p2
Version: 3.6
Hardware: PC Windows XP
Importance: P3 normal
Target Milestone: 3.6 RC1
Assignee: Susan McCourt CLA
QA Contact:
URL:
Whiteboard:
Keywords:
Duplicates: 311776
Depends on:
Blocks:
 
Reported: 2010-05-06 12:38 EDT by Walter Harley CLA
Modified: 2010-05-07 17:00 EDT
CC List: 3 users

See Also:
pascal: review+


Attachments
patch to ProvisioningContext (3.67 KB, patch)
2010-05-06 19:33 EDT, Susan McCourt CLA
repo.xml (5.78 KB, application/xml)
2010-05-07 16:58 EDT, Andrew Niefer CLA

Description Walter Harley CLA 2010-05-06 12:38:27 EDT
I installed Eclipse 3.6M7 from scratch.  Added the http://download.eclipse.org/eclipse/updates/3.6-I-builds update site, selected "Releng Tools", and asked to install.

The operation then took almost 10 minutes to complete, almost all of it in "Calculating Requirements and Dependencies".  It seems to be cycling through a bunch of the same TPTP plugins over and over for about 8 minutes of this - it's hard to tell because they flick by fairly fast and the names are long, but the plugin names were definitely repeating, can't be sure about the version numbers.  Note that I do not have TPTP installed, just JDT and PDE.

The error log is empty.
Comment 1 Susan McCourt CLA 2010-05-06 13:05:21 EDT
Do you have the
[ ] Contact all update sites to find requirements 
box checked?

If so, then you are loading the Helios site for the first time, which would download the jar files that represent the child sites, and I believe that for old-style (site.xml) sites this also involves downloading the actual features so that the metadata can be generated.
Comment 2 Pascal Rapicault CLA 2010-05-06 13:19:04 EDT
Setting as 3.6 to keep on the radar.
Comment 3 Pascal Rapicault CLA 2010-05-06 13:24:53 EDT
*** Bug 311776 has been marked as a duplicate of this bug. ***
Comment 4 Susan McCourt CLA 2010-05-06 13:33:41 EDT
Looking at the screenshot in bug 311917, we see a modelling artifacts.jar being downloaded.  The site being loaded is Helios.  
[x] Contact all sites
is also checked so any other repos being defined are being loaded there.

BUT...thinking more...
Why are we seeing the artifacts.jar being downloaded?
"Normally" we don't load these until the collect phase.
However, the provisioning context now follows artifact repo references, so if the Helios metadata repo has a reference to the Helios artifact repo, that artifact repo is found and loaded during resolve.  (If there is no reference, the repo won't be loaded until the collect phase.)

The reason the artifact references are being loaded during resolution is that as far as the ProvisioningContext life-cycle goes, there is an undefined window of time between resolution and the actual collection of artifacts.  By loading and keeping the actual artifact repos in the provisioning context, we have access to them during the collect phase without influencing or changing the state of the repo manager, which could have changed.  For example, we don't want to reenable some artifact repo that got disabled since resolution, but we would want to consult it if it was there during resolution.

So this loading is the "safest" way to preserve the correct state in the provisioning context, but in practice it is placing a new burden on the resolve phase.  We could consider simply keeping the artifact repo reference URIs and loading them at the time the artifacts are requested.  The downside here is that loading them later will add the repo URI to the manager, so we would want to consult the old state, load the repo, then restore the state, etc.  I had hoped to avoid this kind of knowledge inside ProvisioningContext, but perhaps it's necessary, and in fact more appropriate for the UI-driven use case, where it is unlikely that the state in the manager changed while the wizard was up.
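A minimal sketch of the "consult the old state, load, then restore" idea, in Java. ToyRepoManager and DeferredArtifactLoad are hypothetical stand-ins invented for illustration; they are not the p2 repository manager API, and the only behaviour assumed is the one described above, namely that loading a location adds and enables it in the manager.

```java
import java.net.URI;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a repository manager; NOT the p2 API.
final class ToyRepoManager {
    // location -> enabled flag; a missing key means the manager does not know the repo
    private final Map<URI, Boolean> known = new LinkedHashMap<>();

    boolean contains(URI location) { return known.containsKey(location); }
    boolean isEnabled(URI location) { return known.getOrDefault(location, false); }
    void setEnabled(URI location, boolean enabled) { known.put(location, enabled); }
    void remove(URI location) { known.remove(location); }

    // Assumed side effect: loading adds the location to the manager and enables it.
    String load(URI location) {
        known.put(location, true);
        return "repo@" + location;
    }
}

final class DeferredArtifactLoad {
    // Load a remembered artifact repo location, then put the manager back the way it was.
    static String loadPreservingState(ToyRepoManager manager, URI location) {
        boolean wasKnown = manager.contains(location);
        boolean wasEnabled = manager.isEnabled(location);
        String repo = manager.load(location);
        if (!wasKnown) {
            manager.remove(location);                 // it was not present before
        } else {
            manager.setEnabled(location, wasEnabled); // restore the earlier flag
        }
        return repo;
    }
}
```

The sketch is only safe if nothing else touches the manager between the state check and the restore.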

Pascal, what do you think?
Comment 5 Walter Harley CLA 2010-05-06 14:03:06 EDT
I don't remember whether the "Contact all update sites to find requirements" box was checked - I didn't change any settings after my fresh 3.6M7 install, and I don't remember having changed that in the past but it's an old workspace.

I definitely saw artifacts.jar file(s) being downloaded, but that was after the initial delay while checking TPTP stuff.
Comment 6 Susan McCourt CLA 2010-05-06 14:47:28 EDT
I can't speak to the non-artifacts.jar related issues.

However, we do have a problem here with downloading artifacts.jar, and Pascal informs me that we do not cache these, so we need to fix this.

The risk we were trying to avoid when we decided to load the artifact repos at resolve time was something like this:

- artifact ref is enabled at time of resolve (but not loaded)
- at collect time, the ref is no longer enabled
- provisioning context loads the URI (which adds it to the manager)
- we don't want to enable something in the manager that was not enabled before
- so we have provisioning context try to be smart and do something like:  "check manager, load repo, restore repo to manager state" 
- so it's possible to get something like:

oldState = disabled or not present
load Repo (which adds and enables it)
<----------USER (OR SOME OTHER THREAD) ENABLES IT DURING THIS TIME---------->
set state back to disabled

Now we have the infamous "lost artifact repository" problem where the artifact repo is disabled even though the metadata repo (and the user view of the world) is that it is enabled.
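The interleaving above can be replayed deterministically. This is a toy model in Java, not p2 code: Manager and loadAndRestore are hypothetical names, and the "other thread" is modelled as a callback that runs during the simulated download rather than as a real thread.

```java
import java.util.HashMap;
import java.util.Map;

final class LostRepoDemo {
    // Hypothetical manager state: location -> enabled. Not the p2 API.
    static final class Manager {
        final Map<String, Boolean> enabled = new HashMap<>();
        boolean isEnabled(String loc) { return enabled.getOrDefault(loc, false); }
        void setEnabled(String loc, boolean on) { enabled.put(loc, on); }
    }

    // "check manager, load repo, restore repo to manager state".
    // duringLoad models whatever the user (or another thread) does while the download runs.
    static void loadAndRestore(Manager manager, String loc, Runnable duringLoad) {
        boolean oldState = manager.isEnabled(loc); // disabled or not present
        manager.setEnabled(loc, true);             // loading adds and enables it
        duringLoad.run();                          // the long download window
        manager.setEnabled(loc, oldState);         // blindly restores the stale state
    }

    // Returns the final enabled flag when the user enables the repo mid-load.
    static boolean userEnablesDuringLoad() {
        Manager manager = new Manager();
        String loc = "http://example.invalid/artifacts";
        manager.setEnabled(loc, false);
        loadAndRestore(manager, loc, () -> manager.setEnabled(loc, true));
        return manager.isEnabled(loc);
    }
}
```

In this model userEnablesDuringLoad() ends with the repo disabled, because the restore step overwrites the user's change with the state captured before the load.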

Note that loading the non-cached repo takes some time, so the window in which some other thread could change the state is quite long.  The possibility of this in the SDK workflows is unlikely, but theoretically possible if the user had pressed finish, disabled some site before its artifacts.jar was touched, and then enabled it again while the artifacts.jar was being downloaded by the install operation. 

A more paranoid approach could be to load the artifact repo when needed without regard for its state in the manager.  If it was enabled during resolve, it's going to get enabled again.  This would never cause "lost artifact repositories"  but could theoretically cause unnecessary downloads of artifacts.jars during installation/update operations.
Comment 7 Susan McCourt CLA 2010-05-06 19:33:28 EDT
Created attachment 167414
patch to ProvisioningContext

This patch moves the artifact repo loading so that it occurs when the artifacts are actually requested by the client rather than during reference following.  When a reference is found, its location is remembered rather than the loaded repo.
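For readers without the attachment, the shape of the change can be sketched roughly as follows in Java. This is not the attached patch: LazyContextSketch, rememberReference and getArtifactRepositories are hypothetical names, the loader function stands in for the expensive repository load, and caching the loaded result after the first request is an assumption of the sketch.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Function;

// Hypothetical miniature of a provisioning context; names are illustrative only.
final class LazyContextSketch {
    private final Set<URI> referencedLocations = new LinkedHashSet<>();
    private final Function<URI, String> loader; // stands in for the expensive repo load
    private List<String> loaded;                // null until artifacts are first requested

    LazyContextSketch(Function<URI, String> loader) {
        this.loader = loader;
    }

    // Called while following references during resolution: cheap, nothing is downloaded.
    void rememberReference(URI location) {
        referencedLocations.add(location);
    }

    // Called when artifacts are requested: the first call pays the loading cost.
    List<String> getArtifactRepositories() {
        if (loaded == null) {
            loaded = new ArrayList<>();
            for (URI location : referencedLocations) {
                loaded.add(loader.apply(location));
            }
        }
        return loaded;
    }
}
```

With this shape, resolution only records URIs, and any download happens when the artifact repositories are first asked for.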
Comment 8 Susan McCourt CLA 2010-05-06 19:39:35 EDT
Pascal, can you review this patch?
This adopts the approach we discussed whereby we defer loading of artifact repos until artifacts (or keys, or repos) are requested.  The downside is that if a user disables a repo between the time that the metadata is retrieved and the artifacts are requested, the artifact repo will be reenabled.  As we discussed on IRC, this seems better than accidentally disabling an artifact repo.

I tested that the repo reference following still works (bug 278191).

And in theory I can understand why there could have been repeated downloads of the same file.  The old code remembered the referenced artifact repos, and added them to the specified list of artifact repos, without removing duplicates.  Since artifact repos aren't cached, it's possible that we'd see the same artifact repo downloaded twice - once for the reference in the provisioning context, and again for the reference from a metadata repository.
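To illustrate how the duplicate could arise, here is a small Java sketch with hypothetical names (RepoListMerge, appendAll, mergeDistinct); it models only the list handling described above and makes no claim about what the patch itself does. Each entry in the resulting list is assumed to cost one download, since artifact repos aren't cached.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;

final class RepoListMerge {
    // Behaviour described for the old code: append referenced repos, duplicates and all.
    static List<URI> appendAll(List<URI> specified, List<URI> referenced) {
        List<URI> all = new ArrayList<>(specified);
        all.addAll(referenced);
        return all;
    }

    // De-duplicating alternative: each location appears once, first-seen order kept.
    static List<URI> mergeDistinct(List<URI> specified, List<URI> referenced) {
        LinkedHashSet<URI> distinct = new LinkedHashSet<>(specified);
        distinct.addAll(referenced);
        return new ArrayList<>(distinct);
    }
}
```

If a location is both specified in the context and referenced from a metadata repository, appendAll lists it twice while mergeDistinct lists it once.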

What bothers me is I don't see the original problem as reported, so I can't prove that it's gone.  I think there must be a set of repos that's worse than the others.  I tried clearing out my caches and installing from the I-build site, with Helios also enabled, with [x] contact all sites checked, and while I do see the metadata coming down, I don't see artifacts being downloaded in either case.
Comment 9 Susan McCourt CLA 2010-05-06 20:13:58 EDT
I've tried a number of different combinations of sites and installs, and I'm just not seeing the behavior as reported...although I can understand looking at the code how it could happen.  Until I can make it happen, I can't prove that this fix stops it.
Comment 10 Pascal Rapicault CLA 2010-05-06 21:21:08 EDT
Fix released. Awesome turnaround!
Comment 11 Andrew Niefer CLA 2010-05-07 09:46:48 EDT
I think deferring loading of the artifact repos is good because at that point the operation has generally been moved to the background, so the user isn't blocked by it in the same way as when it happens during dependency resolution.

I'll try to reproduce the "tptp problem" with this fix since I ran into it earlier.
Comment 12 Andrew Niefer CLA 2010-05-07 16:35:59 EDT
This seems better now.  I had forgotten to check this but got reminded when I was trying to install the releng tools and it started downloading jars.   I exported org.eclipse.equinox.p2.engine (with the changes) into my host and restarted.  When I tried again to install the releng tools things worked much better.

I also installed C/C++ Library API Documentation from Helios without problems.
Comment 13 Susan McCourt CLA 2010-05-07 16:49:29 EDT
Thanks, Andrew.
Just for curiosity's sake, could you export your software sites from the available sites pref page and attach the xml file here?  I'm curious why I never observed this.  I installed releng tools and also Eclipse C/C++ (since that was pictured in one of the screenshots showing the modelling jars downloaded), and never saw any artifacts.jar downloading.  So I'm wondering if it's some other site (which would make sense if [x] contact all sites was checked)
Comment 14 Andrew Niefer CLA 2010-05-07 16:58:35 EDT
Created attachment 167578
repo.xml

Here is my repo list.  It is essentially just Helios, together with the big list that comes out of that.

(Also the I-Builds repo from the build machine.)
Comment 15 Pascal Rapicault CLA 2010-05-07 17:00:57 EDT
Thanks for confirming, Andrew!