| Summary: | [planner] Planner does not find the solution with the default time out settings | | |
|---|---|---|---|
| Product: | [Eclipse Project] Equinox | Reporter: | Pascal Rapicault <pascal> |
| Component: | p2 | Assignee: | Pascal Rapicault <pascal> |
| Status: | CLOSED FIXED | QA Contact: | |
| Severity: | normal | | |
| Priority: | P3 | CC: | dj.houghton, john.arthorne, leberre, natalia.bartol, pwebster |
| Version: | 3.6 | | |
| Target Milestone: | 3.6 M6 | | |
| Hardware: | PC | | |
| OS: | Mac OS X - Carbon (unsup.) | | |
| Whiteboard: | | | |
| Attachments: | | | |

Description
Pascal Rapicault
Actually I had to put the timeout to 20000, which means that the install now takes 160 sec to resolve.

Hum, 160s is too much for an installation. I will take a look at the testcase.

This is especially long given that it is happening while reconciling on startup. Someone tried to sort the IUs and the requirements and was able to get the expected results, but this sounds shaky to me, which is why I'm not proposing this. Though, what I'm wondering is if we could influence the search space by simply sorting the variables on the optimization function.

(In reply to comment #3)
> This is especially long given that it is happening while reconciling on startup.
> Someone tried to sort the IUs and the requirements and was able to get the
> expected results, but this sounds shaky to me, which is why I'm not proposing
> this.
> Though, what I'm wondering is if we could influence the search space by simply
> sorting the variables on the optimization function.

The problem is that the P2 solver not only returns a non-optimal solution, but the returned solutions also differ when the test is run a few times.
Here are the plans that I received running the test case three times:

The plan:
[R]com.dcns.rsm.profile.equipment 1.0.4.v20090831 will be installed
[R]com.dcns.rsm.profile.gemo 3.7.2.v20100108 will be installed
[R]com.dcns.rsm.profile.system 4.2.2.v20100112 will be installed
[R]com.dcns.rsm.rda 5.1.0.v20100112 will be installed

The plan:
[R]com.dcns.rsm.profile.equipment 1.2.2.v20100108 will be installed
[R]com.dcns.rsm.profile.gemo 3.7.2.v20100108 will be installed
[R]com.dcns.rsm.profile.system 4.2.2.v20100112 will be installed
[R]com.dcns.rsm.rda 5.1.0.v20100112 will be installed

The plan:
[R]com.dcns.rsm.profile.equipment 1.2.1.v20100106 will be installed
[R]com.dcns.rsm.profile.gemo 3.7.2.v20100108 will be installed
[R]com.dcns.rsm.profile.system 4.2.2.v20100112 will be installed
[R]com.dcns.rsm.rda 5.1.0.v20100112 will be installed

Sorting bundles before passing them to sat4j causes the returned plan to be always THE SAME, however it is usually INCORRECT (not the highest versions). Sorting the variables on the optimization function will not improve this situation.

Natalia,

The behavior you observe is "normal". SAT4J is deterministic in the sense that for a given input it will always output the same answer. This is actually what happens when you sort the bundles: they are always presented to SAT4J in the same order, so the output is the same. When you do not sort the bundles, the bundles are ordered according to their memory address. That is the reason why, when you run the same test case several times, you may get different non-optimal solutions (because in that case the input is different).

Now, the statement saying that the solution provided is INCORRECT is wrong. The solution provided is NON OPTIMAL, i.e. you will get a fully working version of your software. The issue is that it is not picking the latest version, because the solver outputs the best solution found SO FAR. With an increased timeout, you will likely always get the same result (the optimal solution).
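The ordering effect described above can be illustrated with a small Java sketch. It is not p2 or SAT4J code: the `Unit` class and the `stableOrder` helper are hypothetical names introduced here, and the version comparison is a plain string comparison rather than OSGi version ordering. It only shows why a hash-based collection of objects without `hashCode`/`equals` can iterate in a different order from run to run, while a sorted list always presents the same input.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class BundleOrderSketch {
    // Hypothetical stand-in for an installable unit. It deliberately does NOT
    // override hashCode/equals, so hash-based collections fall back to the
    // identity hash, which can differ from one JVM run to the next.
    static final class Unit {
        final String id;
        final String version;

        Unit(String id, String version) {
            this.id = id;
            this.version = version;
        }

        @Override
        public String toString() {
            return id + " " + version;
        }
    }

    // Sort by id, then by version string (plain string order, an assumption
    // made for brevity), so the order handed to a solver is stable.
    static List<Unit> stableOrder(Set<Unit> units) {
        List<Unit> sorted = new ArrayList<>(units);
        sorted.sort(Comparator.comparing((Unit u) -> u.id).thenComparing(u -> u.version));
        return sorted;
    }

    public static void main(String[] args) {
        Set<Unit> units = new HashSet<>();
        units.add(new Unit("com.dcns.rsm.profile.equipment", "1.2.2.v20100108"));
        units.add(new Unit("com.dcns.rsm.profile.equipment", "1.0.4.v20090831"));
        units.add(new Unit("com.dcns.rsm.rda", "5.1.0.v20100112"));

        // The HashSet order depends on identity hashes and may vary between runs.
        System.out.println("set iteration order: " + units);
        // The sorted order is the same on every run.
        System.out.println("sorted order:        " + stableOrder(units));
    }
}
```

A stable input order makes the result reproducible, but, as the thread points out, it does not by itself make the result optimal.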
To improve the current behavior, I need to take a close look at the test case.

This "normal" behavior works as if P2 picks up random plug-ins during startup... This is not what users expect. I ran the test case with Eclipse 3.5 and I could not reproduce the problem. So this seems to occur only on 3.4. Increasing the timeout in Eclipse 3.4, even to more than 20000, does not help. Is a backport from 3.5 possible?

Which version of 3.4 are you using?

We fixed an issue in the encoding in 3.4.2: bug #267518

That is the version that ships with 3.5.

There is a new objective function in 3.6M5. bug #259537

(In reply to comment #7)
> Which version of 3.4 are you using?
>
> We fixed an issue in the encoding in 3.4.2: bug #267518
>
> That is the version that ships with 3.5.
>
> There is a new objective function in 3.6M5. bug #259537

I tested with classes from the 3_4_maintenance branch. As far as I can see, the patch for bug #267518 is applied there, but this does not fix the problem.

Which VM are you running?

(In reply to comment #9)
> Which VM are you running?

C:\Program Files\IBM\SDP_RSA_2\jdk\jre\bin>java -version
java version "1.6.0"
Java(TM) SE Runtime Environment (build pwi3260sr5ifix-20090824_02(SR5+IZ53892+IZ58949+IZ53194+IZ43801+IZ58796+IZ51441))
IBM J9 VM (build 2.4, J2RE 1.6.0 IBM J9 2.4 Windows XP x86-32 jvmwi3260sr5ifx-20090821_41076 (JIT enabled, AOT enabled)
J9VM - 20090821_041076_lHdSMr
JIT - r9_20090518_2017
GC - 20090417_AA)
JCL - 20090824_02

Eclipse 3.4 uses SAT4J as an external tool.

It creates a file with the .opb extension in your temp directory when it does the resolution.

Could you attach to this bug one of the generated opb files when the resolver is providing a suboptimal solution?

Eclipse 3.5 ships with an updated version of SAT4J (2.1.x vs 2.0.x in 3.4). You might want to check whether updating SAT4J solves the problem.

Created attachment 158219 [details]
p2Encoding6582283682147754709.opb
.opb file generated by solver
(In reply to comment #11)
> Eclipse 3.4 uses SAT4J as an external tool.
>
> It creates a file with the .opb extension in your temp directory when it does the
> resolution.
>
> Could you attach to this bug one of the generated opb files when the resolver is
> providing a suboptimal solution?
>
> Eclipse 3.5 ships with an updated version of SAT4J (2.1.x vs 2.0.x in
> 3.4). You might want to check whether updating SAT4J solves the problem.

I've tried with SAT4J 2.1.x: no improvement. I've attached the .opb file.

The returned plan is:

The plan:
[R]com.dcns.rsm.profile.equipment 1.2.2.v20100108 will be installed
[R]com.dcns.rsm.profile.gemo 3.7.2.v20100108 will be installed
[R]com.dcns.rsm.profile.system 4.0.0.v20091030 will be installed (should be 4.2.2.v20100112)
[R]com.dcns.rsm.rda 5.1.0.v20100112 will be installed

Thanks Natalia!

I can test several strategies available in SAT4J on that problem now.

I would not be pessimistic, but that instance has 15K variables and 42K constraints.

This is not trivial for a PB solver... :(

(In reply to comment #14)
> Thanks Natalia!
>
> I can test several strategies available in SAT4J on that problem now.
>
> I would not be pessimistic, but that instance has 15K variables and 42K
> constraints.
>
> This is not trivial for a PB solver... :(

Oh, that's true. I'm wondering why nobody noticed such behavior before... As you mentioned, it should be "normal" ;>

Bad news. The problem is not on the SAT4J side. The optimal solution is found with any version of the solver within 2 seconds.

We have an optimization function issue here: the value of the objective function is the same for all the solutions you got (thus the solver can output any of them).

You focus your attention on the following packages:

[R]com.dcns.rsm.profile.equipment 1.2.2.v20100108 will be installed
[R]com.dcns.rsm.profile.gemo 3.7.2.v20100108 will be installed
[R]com.dcns.rsm.profile.system 4.0.0.v20091030 will be installed (should be 4.2.2.v20100112)
[R]com.dcns.rsm.rda 5.1.0.v20100112 will be installed

However, I am pretty sure that there are some dependencies that prevent com.dcns.rsm.profile.system from being installed together with at least one other recent plugin (call it X).

When you get an optimal solution for com.dcns.rsm.profile.system, you will not get an optimal one for X, and vice versa.

For SAT4J, both solutions are perfectly equivalent. (I know that for you they are not :))

The best thing we can do is to study this problem more deeply to see if the objective function can be improved...

(In reply to comment #16)
> However, I am pretty sure that there are some dependencies that prevent
> com.dcns.rsm.profile.system from being installed together with at least one other
> recent plugin (call it X).
>
> When you get an optimal solution for com.dcns.rsm.profile.system, you will not
> get an optimal one for X, and vice versa.
>
> For SAT4J, both solutions are perfectly equivalent. (I know that for you they
> are not :))
>

But during tests I've seen many times that all the highest versions were picked up. So I don't believe there are conflicting dependencies... And for SAT4J such a solution is also optimal... but unfortunately not every time.

Again, I guess you mean all the highest versions of the bundles that you care about, right?

I am talking about the global solution (with all the bundles).

If it is possible for you to attach two full solutions (including all packages to install), one with all your bundles at the highest version and another one without, I could show you my point on a real scenario.
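The tie described above (two solutions with the same objective value, so the solver may return either) can be made concrete with a toy model. This is only a sketch under stated assumptions, not the p2 encoding: the penalty scheme (0 for the newest version, 1 for the previous one), the conflict between the newest "system" bundle and the newest "X" bundle, and all names in the code are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

public class ObjectiveTieSketch {
    // Toy objective (assumed): each bundle contributes a penalty equal to how
    // many versions behind the newest it is (0 = newest). Lower is better.
    static int cost(int systemRank, int xRank) {
        return systemRank + xRank;
    }

    // Toy constraint (assumed): the newest "system" bundle and the newest
    // "X" bundle cannot be installed together.
    static boolean feasible(int systemRank, int xRank) {
        return !(systemRank == 0 && xRank == 0);
    }

    // Enumerate every feasible choice and keep all of those with minimal cost.
    static List<int[]> optimalSolutions() {
        List<int[]> best = new ArrayList<>();
        int bestCost = Integer.MAX_VALUE;
        for (int s = 0; s <= 1; s++) {
            for (int x = 0; x <= 1; x++) {
                if (!feasible(s, x)) {
                    continue;
                }
                int c = cost(s, x);
                if (c < bestCost) {
                    bestCost = c;
                    best.clear();
                }
                if (c == bestCost) {
                    best.add(new int[] {s, x});
                }
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Both "newest system, older X" and "older system, newest X" are
        // reported, because they have the same cost under this objective.
        for (int[] sol : optimalSolutions()) {
            System.out.println("system rank=" + sol[0] + ", X rank=" + sol[1]
                    + ", cost=" + cost(sol[0], sol[1]));
        }
    }
}
```

In this toy model the objective cannot tell the two plans apart, which mirrors the point made in the comment: from the solver's side they are equivalent, even though a user who only watches the "system" bundle sees one of them as wrong.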
(In reply to comment #18)
> Again, I guess you mean all the highest versions of the bundles that you care
> about, right?
>
> I am talking about the global solution (with all the bundles).
>
> If it is possible for you to attach two full solutions (including all packages
> to install), one with all your bundles at the highest version and another one
> without, I could show you my point on a real scenario.

Ok, I understand. I've seen a variety of different versions of different plug-ins being picked up... So we have plenty of optimal solutions...

As I see in the Projector class, you were the one who fixed the encoding and the optimization function. What changes made this situation stable in 3.5? :)

The main change in 3.5 was to tightly integrate SAT4J with p2.

I do not think that we are much more "stable" in 3.5: we did not make any fundamental change to the objective function. Pretty much the same situation could occur.

Are you using patches to install your product?

(In reply to comment #20)
> The main change in 3.5 was to tightly integrate SAT4J with p2.
>
> I do not think that we are much more "stable" in 3.5: we did not make any
> fundamental change to the objective function. Pretty much the same situation could
> occur.
>
> Are you using patches to install your product?

I'm not using patches. I just have a set of plug-ins and I add them to the dropins folder.

If 3.5 uses much the same mechanism, why does it work correctly there? Is it just a coincidence that the returned solution is optimal from the user's point of view...? Is there something special about this set of plug-ins in dropins? I've also tested with a simple "hello world" plug-in in three versions, and there too the highest version was not picked up.

Created attachment 158465 [details]
CorePluginsUnistalledTestCase
I've observed another strange behavior of P2 planner. Installation of feature patch causes plug-ins like equinox.launcher or p2.reconciler.dropins to be uninstalled.
As a result configuration is broken and environment does not start.
Test case showing this issue attached. It contains also .opb file generated for this case.
Plan returned:
The plan:
[R]org.eclipse.core.contenttype 3.3.1.R34x_v20090604 will be replaced with [R]org.eclipse.core.contenttype 3.3.1.R34x_v20090825-1137
[R]org.eclipse.equinox.app 1.1.0.v20080421-2006 will be replaced with [R]org.eclipse.equinox.app 1.1.1.R34x_v20091203
[R]org.eclipse.osgi 3.4.3.R34x_v20081215-1030 will be replaced with [R]org.eclipse.osgi 3.4.4.R34x_v20091203
[R]org.eclipse.rcp.R342patch.feature.group 1.0.4 will be replaced with [R]org.eclipse.rcp.R342patch.feature.group 1.0.11
[R]org.eclipse.rcp.R342patch.feature.jar 1.0.4 will be replaced with [R]org.eclipse.rcp.R342patch.feature.jar 1.0.11
[R]org.eclipse.swt 3.4.2.v3452d will be replaced with [R]org.eclipse.swt 3.4.2.v3453a
[R]org.eclipse.swt.win32.win32.x86 3.4.1.v3452d will be replaced with [R]org.eclipse.swt.win32.win32.x86 3.4.1.v3453a
[R]org.eclipse.ui.workbench 3.4.2.M20090127-1700 will be replaced with [R]org.eclipse.ui.workbench 3.4.2.r342_v20091113-1600
[R]bootstrap 1.0.2.3422I20090915 will be uninstalled
[R]bootstrap.categoryIU 0.0.0 will be uninstalled
config.a.jre 1.6.0 will be uninstalled
[R]org.eclipse.equinox.feature.group 3.4.1.R342_v20090126-7w7TENgETuNblxYRhBOLydU3ADDC will be uninstalled
[R]org.eclipse.launcher_eclipse.exe 1.0.0 will be uninstalled
[R]org.eclipse.launcher_eclipse.exe.eclipse 1.0.0 will be uninstalled
tooling.org.eclipse.update.feature.default 1.0.0 will be uninstalled
tooling.osgi.bundle.default 1.0.0 will be uninstalled
tooling.source.default 1.0.0 will be uninstalled
toolingorg.eclipse.equinox.launcher 1.0.101.R34x_v20081125 will be uninstalled
toolingorg.eclipse.equinox.p2.reconciler.dropins 1.0.5.v20090307-1115 will be uninstalled
toolingorg.eclipse.equinox.simpleconfigurator 1.0.0.v20080604 will be uninstalled
[R]toolingorg.eclipse.launcher_eclipse.exe 1.0.0 will be uninstalled
What can be seen in configuration/.log file:
!ENTRY org.eclipse.equinox.p2.reconciler.dropins 4 0 2010-02-04 18:59:39.125
!MESSAGE
!STACK 0
org.osgi.framework.BundleException: State change in progress for bundle "reference:file:../SDPShared\plugins\org.eclipse.equinox.p2.reconciler.dropins_1.0.5.v20090307-1115.jar" by thread "Start Level Event Dispatcher".
at org.eclipse.osgi.framework.internal.core.AbstractBundle.beginStateChange(Unknown Source)
at org.eclipse.osgi.framework.internal.core.AbstractBundle.suspend(Unknown Source)
at org.eclipse.osgi.framework.internal.core.Framework.suspendBundle(Unknown Source)
at org.eclipse.osgi.framework.internal.core.PackageAdminImpl.suspendBundle(Unknown Source)
at org.eclipse.osgi.framework.internal.core.PackageAdminImpl.processDelta(Unknown Source)
at org.eclipse.osgi.framework.internal.core.PackageAdminImpl.doResolveBundles(Unknown Source)
at org.eclipse.osgi.framework.internal.core.PackageAdminImpl$1.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)
Caused by: org.eclipse.osgi.framework.internal.core.AbstractBundle$BundleStatusException
... 8 more
Root exception:
org.eclipse.osgi.framework.internal.core.AbstractBundle$BundleStatusException
at org.eclipse.osgi.framework.internal.core.AbstractBundle.beginStateChange(Unknown Source)
at org.eclipse.osgi.framework.internal.core.AbstractBundle.suspend(Unknown Source)
at org.eclipse.osgi.framework.internal.core.Framework.suspendBundle(Unknown Source)
at org.eclipse.osgi.framework.internal.core.PackageAdminImpl.suspendBundle(Unknown Source)
at org.eclipse.osgi.framework.internal.core.PackageAdminImpl.processDelta(Unknown Source)
at org.eclipse.osgi.framework.internal.core.PackageAdminImpl.doResolveBundles(Unknown Source)
at org.eclipse.osgi.framework.internal.core.PackageAdminImpl$1.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)
!ENTRY org.eclipse.equinox.p2.metadata.repository 4 0 2010-02-04 18:59:46.218
!MESSAGE ProvisioningEventBus could not be obtained. Metadata caches may not be cleaned up properly.
!ENTRY org.eclipse.equinox.p2.garbagecollector 4 0 2010-02-04 18:59:46.234
!MESSAGE ProvisioningEventBus service could not be obtained, CoreGarbageCollector will not function properly.
The main difference between the dropins and a classical update site is that the dependencies are set as optional. The optimization function in 3.5 puts more weight on the installation of optional packages, to handle specific test cases like this. That is the reason why you are not seeing that problem in 3.5 or the ongoing 3.6. The solution would be to backport that part of the optimization function to 3.4.

Daniel, is this perhaps related to, or a duplicate of, bug 267518?

Hum, it might be the case. I have to dig into the details.

A fix for bug 267518 is supposed to appear in 3_4_maintenance. I am aware of one situation where plugins could not be installed if they should be optionally installed and have optional requirements (two levels of optional dependencies) and missing dependencies on those requirements. I need to check if we are entering that case.

The product encountering this problem appears to be running with the fix for bug 267518 that was back-ported into the 3.4.x stream, so this might be a different case.

I created two new bug reports to split up the P2 planner problems mentioned within this bug:
1) A new one for the problem of core plug-ins being uninstalled during installation of a feature patch: Bug 302580.
2) Bug 302582 - [planner] P2 does not pick up higher version of already installed plug-in from dropins. This bug contains a test case for simple "hello world" plug-ins.

I believe that everything in here has been addressed.
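The remark above about the 3.5 optimization function putting more weight on optional packages can be pictured with a toy cost function. This is a hedged sketch only: the `Plan` class, the `cost` formula, and the weights are hypothetical and do not reproduce the p2 objective function. It shows one way a larger weight on unmet optional requirements can change which candidate plan has the lowest cost.

```java
public class OptionalWeightSketch {
    // Hypothetical candidate plan: a total version penalty (how far its
    // bundles are behind the newest versions) and the number of optional
    // requirements it leaves unmet.
    static final class Plan {
        final String name;
        final int versionPenalty;
        final int unmetOptional;

        Plan(String name, int versionPenalty, int unmetOptional) {
            this.name = name;
            this.versionPenalty = versionPenalty;
            this.unmetOptional = unmetOptional;
        }
    }

    // Toy objective (assumed): version penalty plus a weighted count of
    // optional requirements that were not satisfied. Lower is better.
    static int cost(Plan p, int optionalWeight) {
        return p.versionPenalty + optionalWeight * p.unmetOptional;
    }

    // Return the cheaper of two plans; ties go to the first argument.
    static Plan best(Plan a, Plan b, int optionalWeight) {
        return cost(a, optionalWeight) <= cost(b, optionalWeight) ? a : b;
    }

    public static void main(String[] args) {
        Plan skip = new Plan("skip optional bundle", 0, 1);
        Plan install = new Plan("install optional bundle", 2, 0);

        // With a small weight, leaving the optional requirement unmet is cheaper.
        System.out.println("weight 1:  " + best(skip, install, 1).name);
        // With a larger weight, installing the optional bundle wins.
        System.out.println("weight 10: " + best(skip, install, 10).name);
    }
}
```

Under these assumed numbers, weight 1 gives costs 1 and 2 (the optional bundle is skipped), while weight 10 gives costs 10 and 2 (the optional bundle is installed).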