Some Eclipse Foundation services are deprecated, or will be soon. Please ensure you've read this important communication.

Bug 216934

Summary: [plan] Performance issues with the Export-Package "uses" directive
Product: [Eclipse Project] Equinox Reporter: Thomas Watson <tjwatson>
Component: FrameworkAssignee: equinox.framework-inbox <equinox.framework-inbox>
Status: RESOLVED FIXED QA Contact:
Severity: normal    
Priority: P4 CC: barthel, d.nachev, david_williams, ekke, give.a.damus, gunnar, jacek.pospychala, jens.muehlenhoff, martin.skorsky, pascal, simon_kaegi, slewis, slubicki, stefan.roeck, wtracey
Version: 3.4Keywords: plan
Target Milestone: Mars   
Hardware: PC   
OS: Windows XP   
Whiteboard:
Bug Depends on:    
Bug Blocks: 275572    
Attachments:
Description Flags
patch
none
Bundle manifest files none

Description Thomas Watson CLA 2008-01-29 11:41:08 EST
The algorithm for finding the best combination of constraint solutions which has the least amount of class space inconsistencies has performance issues as the solution set grows.  This usually involves 1000s of bundles which many bundles exporting the same package.

This bug is to track ideas for how to solve this problem which is possibly NP-Complete or NP-Hard (hopefully NP-Hard).

See http://en.wikipedia.org/wiki/NP-complete and http://en.wikipedia.org/wiki/NP-hard

The current approach in the resolver finds all the possible suppliers for each constraint (Import-Package and Require-Bundle) and then simply iterates over all possible combinations until either a solution with no "uses" conflicts is found or all combinations have been tried.  If no combination exists which provides a solution with no conflicts then the combination which contains the least amount of conflicts is chosen.

The problem with this approach is the number of combinations can quickly grow to astronomical numbers if there are large sets of bundles installed.  Several attempts have been made to reduce the set of combinations to iterate (see bug 181327), but this still is not enough.  Another approach needs to be investigated to solve this problem.  It has been suggested that a SAT solver may be able to help here.

At this point we do not have a clear idea of how to solve this problem.  As a result it is recommended that the community does not specify "uses" constraints in their Export-Package statements.  For 3.4 we will not be able to spend the time to rewrite the resolver to solve this problem.  But we should do the following for 3.4:

1) Provide more debug trace options that can be enabled to debug the "uses" performance problem has been hit.
2) Consider adding a configuration option to either disable the "uses" directive or to simply take the first combination and unresolve any conflicting bundles.
3) Consider adding a monitor thread that can timeout the resolve operation and force it to take the best combination found so far.
Comment 1 BJ Hargrave CLA 2008-01-29 12:32:36 EST
(In reply to comment #0)
> 3) Consider adding a monitor thread that can timeout the resolve operation and
> force it to take the best combination found so far.
> 

A variant of this would be a way to specify a time bound on the resolver computation. The resolver could periodically check whether it has exceeded the time bound and take the best available solution once the time bound had been exceeded. The time bound could either be a fixed amount of time (e.g. 30 sec) or some function of the number of bundles in the algorithm (e.g. f(N,T)=N*T when N is the number of bundles and T is the specified time bound factor. If N=1000 and T=30ms then f(N,T)=30 sec).
Comment 2 Thomas Watson CLA 2008-01-29 12:48:54 EST
(In reply to comment #1)
Thanks BJ, yes that is a better approach to implementing a timeout and should be the approach taken if we decide to implement a timeout.
Comment 3 Thomas Watson CLA 2008-01-29 14:01:07 EST
Created attachment 88173 [details]
patch

Patch that implements a timeout.  The formula used is 

min(B + (N*30), M)

Where B is the base timeout (set to 30000 ms), N is the number of bundles and M is the ceiling timeout allowed (set to 90000 ms).
Comment 4 Thomas Watson CLA 2008-01-30 09:53:01 EST
I opened a separate bug 217094 to release the resolver timeout.  I want to use this bug to track ideas of how to actually solve this problem in a performant way.
Comment 5 Thomas Watson CLA 2008-01-31 14:22:31 EST
I opened bug 217317 to consider adding an option to disable uses class space consistency checks.  I also released additional tracing code to help debug the uses consistency code.
Comment 6 Stefan Röck CLA 2009-03-10 17:29:21 EDT
Has there been any progress on this issue? AFAIK p2 uses SAT solver technology, too. 
One problem with the time-out is that it could lead to undeterministic behaviour, especially in scenarios with some hundrets of bundles. 
Comment 7 Pascal Rapicault CLA 2009-03-10 20:48:00 EDT
Stefan, on the biggest configuration I have seen (more than 2000 bundles in the final solution) the time spent in the solver has always been minimal and I have never seen it timing out. what takes the most time in p2 is the creation of the input to the solver.
Wrt to the use-clause the challenge is now to map the construct to a boolean formula.
Comment 8 Stefan Röck CLA 2009-03-11 04:31:10 EDT
Pascal, we have a target platform of 200 bundles. A lot of them are from Spring Bundle repository which heavily use "uses"-declarations.
The startup time is sth above 1 minute with the default "uses"-resolver. In this case, always a package-uses-conflict is displayed (however, the conrete error varies). 

However, if I use -Dosgi.resolver.usesMode=ignore, startup time is below 5 seconds and no package-uses-conflict is displayed (of course). Though this is no prove, i suppose that the time-out leads to the indeterministic behaviour. What do you think?
Comment 9 Thomas Watson CLA 2009-03-11 10:01:10 EDT
(In reply to comment #8)
> Pascal, we have a target platform of 200 bundles. A lot of them are from Spring
> Bundle repository which heavily use "uses"-declarations.
> The startup time is sth above 1 minute with the default "uses"-resolver. In
> this case, always a package-uses-conflict is displayed (however, the conrete
> error varies). 
> 
> However, if I use -Dosgi.resolver.usesMode=ignore, startup time is below 5
> seconds and no package-uses-conflict is displayed (of course). Though this is
> no prove, i suppose that the time-out leads to the indeterministic behaviour.
> What do you think?
> 

Yes this would indicate that we are hitting a timeout trying to find a good solution.  It is also possible that there is no perfect solution to your bundle set.

Unfortunately we have not made any more progress in solving this issue this release.  As pascal says, more investigation is needed to uses construct onto a boolean formula.  Then we need to get the smallest possible implementation of a pseudo boolean solver into the framework.

It would really help if you could attach the bundle manifests of the 200 bundles involved.  If you do not want to attach them here perhaps you could send me the manifests.
Comment 10 Stefan Röck CLA 2009-03-12 04:44:26 EDT
Created attachment 128528 [details]
Bundle manifest files

Thomas, I attached a subset of the involved bundle manifests. Using them, the startup time is about half minute with standard "uses" resolving and less than 5 seconds with osgi.resolver.usesMode=ignore.

With every other bundle added to the configuration the startup time extends, even if these bundles have explicit dependencies to others (Require-Bundle) and no "uses-"-constraints in their export packages.

I suppose that the webservice libraries cause the bad performance because leaving them out increases startup time clearly.

I hope this helps. Let me know if you need further information.
Comment 11 Pascal Rapicault CLA 2009-03-12 21:53:43 EDT
Stefan, in comment #7 I was talking about the p2 resolver (SAT-based solver) and not about the current OSGi the solver. This is where the confusion is coming from.
Comment 12 David Williams CLA 2015-10-07 13:58:39 EDT
Is this old plan item now fixed? (With resolver work done in Mars?)
Comment 13 Thomas Watson CLA 2015-10-07 14:58:29 EDT
(In reply to David Williams from comment #12)
> Is this old plan item now fixed? (With resolver work done in Mars?)

Yes, I think we should close this as addressed in Mars and any future issues should be tracked in a new bug.  This bug report was mainly discussing improvements to the old implementation of the resolver which is not used anymore in Equinox so any new issues will have to be addressed in a different way and should be done in a different bug.