Some Eclipse Foundation services are deprecated, or will be soon. Please ensure you've read this important communication.

Bug 316362

Summary: MPC considers all repositories when installing
Product: [Technology] MPC Reporter: Ian Bull <irbull>
Component: wizardAssignee: David Green <greensopinion>
Status: RESOLVED FIXED QA Contact:
Severity: blocker    
Priority: P1 CC: david_williams, greensopinion, ian.skerrett, jkrause, leftylynn, mknauer, steffen.pingel, stephan.herrmann
Version: unspecifiedFlags: irbull: review+
Target Milestone: 1.0.0   
Hardware: PC   
OS: Linux   
Whiteboard:
Attachments:
Description Flags
Patch v1
none
Non git format patch
none
mylyn/context/zip
none
Patch v2
none
mylyn/context/zip none

Description Ian Bull CLA 2010-06-09 15:22:43 EDT
When installing software through the MPC, the MPC consults all known p2 repositories.  To be specific, when you select a component from the market place, the MPC first computes the "URL" (update site) for this component, but then the install operation loads this URL plus ALL OTHER p2 repositories to determine the dependencies.  

For example, I have 10 repositories listed in my Available Software sites, and when I install a component through the market place all 10 repositories are first fetched and loaded.

This approach has a number of problems:

1. Performance.  The fact all repositories are first loaded means that all repositories must be reachable (or we timeout trying).  For example, at EclipseCon where we sometimes have a dedicated server, if we hosted a marketplace on that dedicated server it would not matter because clients would still attempt to reach the outside world for each install.  This also means that as clients start using both p2 and the MPC to install things, they will begin to face serious performance hits.

2. Authenticity of what I'm installing.  If I list a URL for my component at the market place, then I assume that this URL will be the one consulted when the component is installed. However, if the same component is listed in more than one p2 repository, then it's anybodies guess which repository will actually provide the component.
Comment 1 Ian Bull CLA 2010-06-09 16:00:12 EDT
To follow up, there are reasons why we should consult all repositories -- namely, if your repository does not provide all the needed dependencies, you 'cross your fingers' and hope that they can be found in the user's other repositories.  I don't think we should build solutions around this, but I'm sure some people have.

I think this problem can be solved by the Catalog providers.  If a catalog has all the required dependencies, then this should be communicated to the MPC and no additional repositories will be consulted.

There are actually two ways we can do this:

1. An MPC Catalog can list additional repositories that should be consulted when installing components.  For example, if a marketplace wanted to list the Helios repository as a reference, this could be done.  If a catalog is completely self contained, then it would simply list no additional repositories.

2. An MPC Catalog can declare itself as "self contained", otherwise all other repositories will be consulted.  This can be declared in the catalog extension point. 

In the long term I like solution #1, but I think that might be too late at this point.  I do think this problem is serious enough to consider for 3.6.  I prototyped option #2 because it's much less distributive than #1. I did this by adding an additional attribute on the Catalog extension point named "consult_all". By default, this value is TRUE, so all existing catalogs will continue to function the same way without modification. 

Any thoughts on this problem, or other solutions?
Comment 2 Ian Bull CLA 2010-06-09 18:07:15 EDT
Created attachment 171584 [details]
Patch v1

Here is a patch that adds the ability for a catalog to indicate that it's "self contained". If a catalog is self contained, then no additional repositories will be used when installing software. 

The default value is *false* (i.e. not self contained).  This is the current behaviour, so existing client are not required to do anything.

I have tested this both with clients that are self contained and those that are not.  I have also tested the Eclipse market place.
Comment 3 Ian Bull CLA 2010-06-09 18:36:28 EDT
Created attachment 171588 [details]
Non git format patch

From the Git wiki, I see that it's not possible to apply patches that are in 'git format'. This is the same patch as before, but not in 'git format'.
Comment 4 David Green CLA 2010-06-09 20:23:37 EDT
Thanks for the bug report and patches!

Regarding performance, agreed that things work a lot faster when only one repository is consulted.  While I'm not sure how this relates to the EcilpseCon scenario, the goal is for MPC and p2 to be a well-behaved solution that performs well in all scenarios.

Regarding authenticity, that should really be handled by the platform and bundle signing.  That's really the only way to be sure that they originate from a trusted source.  That said, have you looked at @org.eclipse.epp.internal.mpc.ui.operations.ProfileChangeOperationComputer.computeInstallableUnits(SubMonitor)@?  The IUs are first resolved against the repositories listed in the catalog for the selected solutions.  Only once the IUs are identified is the install operation resolved.  Correct me if I'm wrong, I was under the impression that would ensure that they're installed from the appropriate repositories.

In principal I agree that MPC can be improved in this area to provide a better experience, though perhaps we should consider some alternate solutions.  For instance, we could default to consulting only the selected repository, which could be a composite repository to pull in any dependencies.  Alternatively, we could choose to resolve first against only the selected repository, and only if that fails do we consult all repositories.  These are just two that come to mind, I'm sure that there are other alternatives as well (for example, solution providers could have the option of listing multiple update sites with their solution).

As this bug is not critical in nature, it's too late to make changes of this nature for Helios, however given enough input and testing we could try for the next service release.
Comment 5 David Green CLA 2010-06-09 20:23:40 EDT
Created attachment 171595 [details]
mylyn/context/zip
Comment 6 Ian Bull CLA 2010-06-09 22:36:59 EDT
Yes, I saw the code in ProfileChangeOperationComputer, and it's doing exactly what you describe (and what I would expect). That is, it's computing what repositories are needed for the install (and what IU's are also needed).  This list of repositories is then passed to the resolveInstall method. However, this list is never consulted. In particular, this list is not passed to the ProvisioningContext and according to p2's API, if no repositories are listed on the ProvisioningContext, then ALL repositories shall be used.  

And this really does mean ALL -- that is, all repositories that have ever been used: from other marketplaces, from the p2 UI, from the director app, from previous installs, etc...  As I mentioned above, I have over 10 repositories in my system (many of them composite -- repos that point to other repos) and I'm seeing load times of over 60 seconds (that's before we even start to download the artifacts). While I'm likely an anomaly (since I'm on the p2-dev team and I end up with a lot of test repos), if the marketplace really does take off the way we hope it does, then a dozen repositories might not be uncommon.

The p2 team identified this problem early on, and even added a check box on the installation page "Contact all update sites during install to find required software".  I don't think we should add this checkbox to the MPC as it will take away from the simplicity of the UI, which is why I think we should put the onus on the catalog.

I like some of your other solutions, especially the use of a composite repositories to list the dependencies.  I was trying to arrive at a solution that would not require any changes to existing infrastructure / markets in the short term -- but still provide catalogs the ability to isolate themselves from one another. In the long term I would love to help put together a scalable solution to this problem.
Comment 7 Jochen Krause CLA 2010-06-10 00:57:39 EDT
If the (In reply to comment #4)
> 
> Regarding authenticity, that should really be handled by the platform and
> bundle signing.  That's really the only way to be sure that they originate from
> a trusted source.  That said, have you looked at
> @org.eclipse.epp.internal.mpc.ui.operations.ProfileChangeOperationComputer.computeInstallableUnits(SubMonitor)@?
>  The IUs are first resolved against the repositories listed in the catalog for
> the selected solutions.  Only once the IUs are identified is the install
> operation resolved.  Correct me if I'm wrong, I was under the impression that
> would ensure that they're installed from the appropriate repositories.
> 

If the desired behavour was to use only the repositories listed in the catalog then the implementation does not achieve this (as explained in comment #5). A really simple solution would be to pass the list of computed repositories to the provisioning context in the respective three methods (install, update, uninstall) in ProfileChangeOperationComputer. You are passing the repositories already into the methods, but then never use it. The fix is a single line addition per method:
		operation.getProvisioningContext().setMetadataRepositories(repositories);

This bug is critical to us and from my point of view also a blocker to every provider of solutions on any of the marketplaces.
Comment 8 David Green CLA 2010-06-10 10:14:35 EDT
(In reply to comment #6)
> Yes, I saw the code in ProfileChangeOperationComputer, and it's doing exactly
> what you describe (and what I would expect). That is, it's computing what
> repositories are needed for the install (and what IU's are also needed).  This
> list of repositories is then passed to the resolveInstall method. However, this
> list is never consulted.

We were passing in the list of repositories however this was changed due to bug 306657

> , if the marketplace really does take off the
> way we hope it does, then a dozen repositories might not be uncommon.

It's not unusual for a few uses of MPC to cause the addition of several update sites.  Where users may not have installed so many features before, MPC makes it so easy that I expect people will install more things, if only to try them out.  The resulting number of update sites used will likely increase over a typical pre-MPC installation.

> The p2 team identified this problem early on, and even added a check box on the
> installation page "Contact all update sites during install to find required
> software".  I don't think we should add this checkbox to the MPC as it will take
> away from the simplicity of the UI, which is why I think we should put the onus
> on the catalog.

I agree, simpler is better.

> I like some of your other solutions, especially the use of a composite
> repositories to list the dependencies.  I was trying to arrive at a solution
> that would not require any changes to existing infrastructure / markets in the
> short term -- but still provide catalogs the ability to isolate themselves from
> one another.  In the long term I would love to help put together a scalable
> solution to this problem.

That makes sense to me.  I like how your solution is isolated to the extension point and a few classes, that minimizes the impact.  I don't understand why this policy should apply to a whole catalog.  That said, the changes are small and unlikely to cause problems.

(In reply to comment #7)
> This bug is critical to us and from my point of view also a blocker to every
> provider of solutions on any of the marketplaces.

After consulting the "Helios Endgame Plan":http://www.eclipse.org/eclipse/development/plans/freeze_plan_3.6.php it seems there is some room to get a fix such as this into the build, however it requires PMC approval since it changes API (the extension point).  If I'm to support taking an issue like this to the PMC I'd like to have a very good understanding of why it's so important to have fixed.   Considering that MPC has the same behaviour as the default behaviour of the p2 installation UI, perhaps you can explain why this bug should be considered critical or blocker?
Comment 9 Jochen Krause CLA 2010-06-10 14:26:27 EDT
(In reply to comment #8)

I will try to explain why this is a very serious issue. Before the MPC was around, the selection of "available sites" was an active decision of the user, maybe except the seeding of release train sites from epp.

With the MPC sites get added to the list of available sites without the user even noticing. Just try it yourself, and open the Yoxos marketplace and go the the "popular" tab. That is everything you have to do to get 3-10 new sites added to the sites p2 is considering. In combination with considering all sites at install this becomes a real issue, because the user is not aware of this.

From my point of view this opens a wide door for malicous use of any marketplace, which is enough to make it a blocker that should be resolved for helios. Why? Because during the install it is not determined which repository will be providing the bytes. I don't want to elaborate on this in the public, if you are not convinced that this is an issue I will open a bug that is not publicly visible.

But there is another angle to this. If you install something from a marketplace, you consider that a contract between you, the marketplace provider and the component provider. This is not guaranteed in the current implementation, it is simply not deterministic. 

The patch which Ian Bull has provided does not really completely solve the issues raised above, thats why I suggest to alter it in the following way: 

A marketplace can define one repository that it is providing as a "standard" along the repositories of the solution providers. This repo could be a composite repo and could for example point to the helios repo(s) for the Eclipse marketplace. This composite repo can be changed by the marketplace provider to include more / newer / other components at any point in time. When the mpc client is doing the resolving and installing, it will consider the  solution provider repository and the marketplace "standard" repository.

This way we get to a solution that avoids the security problems with the current implementation for the eclipse marketplace and is still relatively low impact.
Comment 10 Ian Bull CLA 2010-06-10 14:40:34 EDT
The more I think about this problem, the more I realize just how much of a security hole we have here.  As Jochen mentioned, in the p2 UI you *explicitly* add software repositories, in the MPC you trust the market place to add these for you.  

Even if the different marketplaces vetted all the software they listed, folks are able to change their content (since it's not all hosted at Eclipse.org). Simply by browsing a component, that software site could be added and used during a future install.

My proposed patch allowed other market places to secure themselves, but it still left a wide open hole in the default Eclipse Marketplace.  To address this, I have put another patch together following Jochen's suggestion.  Using Jochen's suggestion, each marketplace can specify *trusted* dependencies.  For the eclipse marketplace, the trusted dependency could be the Helios repo -- or more likely, a composite repository that points to Helios.

I will upload the patch shortly.
Comment 11 Ian Bull CLA 2010-06-10 15:25:07 EDT
Created attachment 171665 [details]
Patch v2

This patch uses the 'trusted dependencies' idea.  For the Eclipse Marketplace I have listed http://download.eclipse.org/releases/helios as the trusted dependency. While this works, I suggest we create a composite repository (say http://market.eclipse.org/dependencies) and list helios there.  As we start to develop Indigo, we can add that to the 'trusted list' -- assuming we trust the content in Indigo ;).

The patch is in non-git format. (I'm still not sure what that means).
Comment 12 David Green CLA 2010-06-10 16:14:43 EDT
(In reply to comment #9)
> With the MPC sites get added to the list of available sites without the user
> even noticing. Just try it yourself, and open the Yoxos marketplace and go the
> the "popular" tab. That is everything you have to do to get 3-10 new sites added
> to the sites p2 is considering. In combination with considering all sites at
> install this becomes a real issue, because the user is not aware of this.

I can see how this is a very serious problem.  Adding update sites without any action on part of the user should never happen.  In addition, the UI should show _which_ update sites are added, which it currently does not.

I've raised this issue to the attention of the project lead Ian Skerrett.
Comment 13 Markus Knauer CLA 2010-06-10 16:45:49 EDT
I reviewed the patch a bit and found a minor issue

* CatalogDescriptor.equals() - missing new attribute additionalRepository

But more important is that the changes in ProfileChangeOperationComputer.resolveInstall() look really good!
Comment 14 Jochen Krause CLA 2010-06-10 17:00:02 EDT
I did some testing with the proposed patch v2 and it works well with both the eclipse marketplace and the yoxos marketplace. I tried installing the Object Teams Development Tooling (OTDT) that has initially led to considering all sites (https://bugs.eclipse.org/bugs/show_bug.cgi?id=306657) and it installed just fine.
Comment 15 Ian Bull CLA 2010-06-10 17:58:20 EDT
(In reply to comment #13)
> I reviewed the patch a bit and found a minor issue
> 
> * CatalogDescriptor.equals() - missing new attribute additionalRepository
> 

This is a good point Markus. However, the CatalogDescriptor only uses the URL for identity (hashcode and equals). It doesn't make use of any of it's other fields (name, description, etc..) when computing identity, so I don't think it makes sense for it to use this field either.
Comment 16 Ian Bull CLA 2010-06-10 18:23:13 EDT
I also did a bunch of testing with this patch and bug 306657 (the bug which original introduced the undesired behaviour) is a great example of the problem at hand. In that bug, the Object Teams plugin is assuming the existence of features that I currently don't have installed.  By looking through all known repositories, the *hope* is that the needed software can be found.

However, Object Teams is also "patching" the JDT with unsigned jars. I get a small warning (the type of warning most users will simply ignore) that it's doing this.  But here's the question: do I trust all those unsigned jars, and more importantly, do I trust where they came from?  If I trust the people behind Object Teams (and I trust the marketplace -- i.e. the Eclipse foundation in this case) then this should be enough. However, if I can't be sure that *only* those software repositories are being consulted, then that trust is broken.

In essence, this patch (v2) is making things explicit.  It's saying, we only use what the component provider gives us + what we (the marketplace provider) trust.  

So in short, Bug 306657 is still fixed by this patch, but now I know exactly where my unsigned jars and their dependencies, are coming from. Everything is coming from either the Object Team repository OR a repository that the marketplace trusts -- and that's a manageable risk.
Comment 17 David Green CLA 2010-06-10 21:13:58 EDT
Thanks for all of your hard work on this issue.  I'll take a look at your latest patch shortly.

I've committed a fix for most of the issues involved based on the selfContained concept in your first patch.  While I realize that this doesn't solve the dependencies problem of your most recent patch, it does fix most of the issues involved.  I encourage you to take a look at revision fef03a6b8367716a6621c25cde2d3785adf90752 as soon as you can and provide feedback.  A build including these changes is available from here: https://build.eclipse.org/hudson/job/epp-mpc-nightly/lastSuccessfulBuild/artifact/site.p2/

The issues addressed in this fix:
* The feature confirmation page of the wizard now shows repository URLs so that the user has some way of correlating selected solutions with URLs.
* Repository URLs are only added to the user's configuration if they select items for installation and press 'Finish' on the wizard.
* Items are not marked as installed unless the repository URL of the catalog item already exists in the user's configuration.

I plan to take a look at your most recent patch and apply the dependencies concept as well since it makes a lot of sense.

At this late stage the more feedback we can get the better.
Comment 18 David Green CLA 2010-06-10 21:14:01 EDT
Created attachment 171685 [details]
mylyn/context/zip
Comment 19 David Green CLA 2010-06-10 22:37:41 EDT
As of 03d15d8fa88742c72a558f2d8b705314c47d66bc this bug should be fixed.  I'd like to have a review by interested parties to verify that no regressions have been introduced.  

Please install the latest build and do as much testing as you can.  You can get the latest from here: https://build.eclipse.org/hudson/job/epp-mpc-nightly/lastSuccessfulBuild/artifact/site.p2/
Feel free to do a code review as well to be sure that nothing was missed.

This fix is time sensitive: if we can get enough coverage by mid-day on the 7th (tomorrow) then we can put it forward for inclusion in Helios.
Comment 20 Ian Bull CLA 2010-06-11 02:03:13 EDT
Thanks everyone (Especially David Green) for all the hard work on this bug. I worked with David this evening to review the code and then I did some testing myself. I tested both the Eclipse Marketplace and the Yoxos one, and things look much better.

I also did some testing with a debugger attached, and watched the p2 repositories that were consulted during the provisioning operations.  With the latest code, the MPC only considers the specified (trusted) repositories.
Comment 21 Jochen Krause CLA 2010-06-11 02:59:17 EDT
I have tested the new version (https://build.eclipse.org/hudson/job/epp-mpc-nightly/lastSuccessfulBuild/artifact/site.p2/) on MacOSX and can also +1 on the code released by David. Thanks for reacting to fast on this issue!

My test scenarios included installing various solutions from both marketplaces in "fresh" RC3 installs with an updated marketplace from the url mentioned above.
Comment 22 Ian Skerrett CLA 2010-06-11 08:08:42 EDT
(In reply to comment #21)
> I have tested the new version
> (https://build.eclipse.org/hudson/job/epp-mpc-nightly/lastSuccessfulBuild/artifact/site.p2/)
> on MacOSX and can also +1 on the code released by David. Thanks for reacting to
> fast on this issue!
> 
> My test scenarios included installing various solutions from both marketplaces
> in "fresh" RC3 installs with an updated marketplace from the url mentioned
> above.

@jochen and @ianbull thanks for you help on this.  Can you please let me know what plugins you have test installed?   We have done significant testing on the different plugins on Eclipse MP and we will need to retest some of these listings.  It would be helpful if you can let us know what you have tested, so we don't duplicate.
Comment 23 Jochen Krause CLA 2010-06-11 09:46:17 EDT
From the Eclipse Marketplace I installed Findbugs, GlassFish Server Open Source Edition Tools, Industrial SQL Connector for Mylyn and Object Teams Development Tooling (OTDT).

The Glassfish Tools relied on a lot of Webtools plugins that we not available in my installation and installed just fine.
Comment 24 Ian Bull CLA 2010-06-11 09:56:50 EDT
I installed a number of the Instantiations tools. I can do more testing today and give you a complete list of what I tried.
Comment 25 Lynn Gayowski CLA 2010-06-11 11:08:34 EDT
I used the Pulsar package to retest 28 plugins that were previously working with the M7 release for Java Developers package and all were successful:
Ant Utility
AnyEdit Tools
Atlassian Connector for Eclipse
Bytecode Outline
CodePro AnalytiX
CodePro Profiler
CollabNet Desktop - Eclipse Edition
Data Hierarchy
Ecalculator
Eclipse Multi-Touch
eDependency
Epsilon
Excelsior JET
Extended VS Presentation
FileSync
FindBugs Eclipse Plugin
GWT Designer (initially failed, but successful on 2nd try)
InstaSearch
JDepend4Eclipse
MyEclipse Enterprise Workbench
Open with Eclipse
Pydev
RSSOwl
SAP NetWeaver Server Adapter
Subclipse
Subversive
Tasktop Pro
Xtext
Comment 26 David Green CLA 2010-06-11 11:20:04 EDT
I've labeled the latest MPC build *Helios RC5* which was built from revision 64de085c53a1d8a7ef2bb7a0e9db3c57157cfd2f on June 11 at aproximately 12:30 am EST.   https://build.eclipse.org/hudson/job/epp-mpc-nightly/81/

This build resolves this issue.
+1 for including this build in Helios

Many thanks to everyone for helping with this one.  Special thanks to Ian Bull who was very helpful with patches, feedback, and collaboration in a code review and testing.
Comment 27 Stephan Herrmann CLA 2010-06-11 16:24:03 EDT
I'm late to this party, but as the reporter of bug 306657, which led to
all this, I'd like to thank you for your hard work in finding a solution 
that doesn't break bug 306657!

I tried the build mentioned in comment 26 and I do like what I see:
the explicit mentioning of the repository from which the feature will
be installed.

I'd like to add one more suggestion for even more transparency:
Shouldn't the user also have an opportunity to investigate the dependencies
being pulled in implicitly? I'm thinking of a notice:
 "This operation will also install software from these additional sites:
   + repo.url.1
   + repo.url.2"
Ideally, those items could be expanded to show those actual artifacts.
If these are bundle dependencies this would be more fine grained than the
other parts of the MPC/P2 UI, but I believe the user should be able to
obtain this information upon request ("WHY does install want to fetch
things from those sites??").

I like the simplicity of the MPC but I'm also a great fan of transparency.

Is s.t. like this
 - already planned for p2?
 - a bad idea to begin with?
 - candidate for a follow-up bug?
Comment 28 Stephan Herrmann CLA 2010-06-13 12:58:06 EDT
FYI: I just filed bug 316702 against p2, proposing a more general approach 
to involving the user in security decisions - post-Helios of course :).
Comment 29 David Green CLA 2010-06-14 10:29:38 EDT
(In reply to comment #28)
> FYI: I just filed bug 316702 against p2, proposing a more general approach
> to involving the user in security decisions - post-Helios of course :).

Great, that's the right place for it.  If/when bug 316702 is resolved, feel free to open a new bug with MPC to integrate any of p2's changes with MPC.