Community
Participate
Working Groups
Take a large set of projects and files in project explorer and doing an expand all using "*" on the numeric keyboard in Windows. A noticeable (8s out of 37s) performance degradation has occurred. After profiling both 3.4.2 and 3.4.2 with YourKit and comparing the snapshots, there was a noticeable difference in the time taken to evaluate org.eclipse.core.internal.Expressions.isSubtype() from 31ms to 5328ms. The cause seems to be the addition of a new Key class, which states that it was first introduced in 3.4.1. The majority of the performance difference seems to be in the construction of the Key object to be used to look in to the cache. The project explorer is quite susceptible to changes in the enablement expression code (n.b. Alex Fitzpatrick's fix to cache class subtype/supertype hierarchy made a very large difference in PE performance in the past). Perhaps the project explorer needs a way to back out this new classloader aware isSubType() method because it is unlikely that there will be two different versions of the same class loaded by two different class loaders in a Project Explorer case.
The new key was added by 234919 [expressions] instanceof map should contain cleanup code https://bugs.eclipse.org/bugs/show_bug.cgi?id=234919 This was released to 3.4 maintainance stream on 20081126 https://bugs.eclipse.org/bugs/show_bug.cgi?id=234919#c16
(In reply to comment #0) > The project explorer is quite susceptible to changes in the enablement > expression code (n.b. Alex Fitzpatrick's fix to cache class subtype/supertype > hierarchy made a very large difference in PE performance in the past). Perhaps > the project explorer needs a way to back out this new classloader aware > isSubType() method because it is unlikely that there will be two different > versions of the same class loaded by two different class loaders in a Project > Explorer case. > It only matters that 2 different plugins provide the same class FQCN (even if the classes themselves are not related). I don't believe it's really that common, but it is definitely possible in an OSGi based environment. Does the CNF enable the Project Explorer to accept new content providers? PW
(In reply to comment #2) > (In reply to comment #0) > > The project explorer is quite susceptible to changes in the enablement > > expression code (n.b. Alex Fitzpatrick's fix to cache class subtype/supertype > > hierarchy made a very large difference in PE performance in the past). Perhaps > > the project explorer needs a way to back out this new classloader aware > > isSubType() method because it is unlikely that there will be two different > > versions of the same class loaded by two different class loaders in a Project > > Explorer case. > > > > > It only matters that 2 different plugins provide the same class FQCN (even if > the classes themselves are not related). > > I don't believe it's really that common, but it is definitely possible in an > OSGi based environment. > > Does the CNF enable the Project Explorer to accept new content providers? > > PW > You can dynamically enable content providers in the CNF if that's what you are asking
(In reply to comment #0) > Take a large set of projects and files in project explorer and doing an expand > all using "*" on the numeric keyboard in Windows. > > A noticeable (8s out of 37s) performance degradation has occurred. That Key class is necessary to make this code behaviourally correct. I assume it's still way better than pre-optimization. I'd be willing to re-visit the Key class and see if there's further optimizations that can be done. PW
I tried to measure this myself, but I don't see such performance differences. How exactly did you measure? What we could do is to take the classloader out of the Key, and store it in a new Value type. The fgKnownClasses map then points from Key to Value, and the classloader would only be compared when the keys are already equal.
Hi Markus, The CPU performance snapshots were taken in a project explorer expand-all use case. I didn't enable method invocation counting but I would imagine that the method is being invoked a large number of times, perhaps even tens of thousands of times, for all of the content providers that have enablement expressions based on the instanceof particular classes.
Created attachment 127200 [details] nested HashMaps for R3_4_maintenance I did some measurements with ExpressionTests.testSubTypeTiming() with TYPE_ITERATIONS = 1000000: * Current R3_4_maintenance: Cached Delta: 657 Instance Delta: 9344 * removed WeakReference fClassLoader from Key (incorrect): Cached Delta: 344 Instance Delta: 9344 * 3.4.0 version (incorrect): Cached Delta: 234 Instance Delta: 9297 * I also tried an implementation with nested HashMaps, but it wasn't much faster: Cached Delta: 546 Instance Delta: 9422 Measurements for Expand All in Project Explorer, using org.eclipse.jdt.ui.tests from HEAD * nested HashMap: 2.5s of 60s in Expressions.isSubtype * Current R3_4_maintenance: 3.1s of 60s in Expressions.isSubtype * 3.4.0 version (incorrect): 55ms of 60s in Expressions.isSubtype The attached version with nested HashMaps would be a bit faster (probably since it does not have to allocate objects for the lookup). We could release this for 3.5, but for 3.4.2, I don't think the little gain would be worth a feature patch. The incorrect version was indeed faster, but that's the price you have to pay for correctness. If that's not enough, you have to add caching to the Common Navigator, which evaluates the same expression over and over again.
I completely understand the issues of "incorrectness" of the initial patch for 3.4, however I am not aware of any real world use case that would not be solved by simply flushing the map when the bundle state changes. Do we really need this level of complication to such a simple problem?
(In reply to comment #8) > I completely understand the issues of "incorrectness" of the initial patch for > 3.4, however I am not aware of any real world use case that would not be solved > by simply flushing the map when the bundle state changes. This isn't dependent on bundle state ... I can have 2 bundles active right from the get-go that cause this problem (the bundles don't have to have anything to do with each other to create this problem). PW
Prior to the introduction of the cache how was this situation handled?
(In reply to comment #10) > Prior to the introduction of the cache how was this situation handled? It was not a problem before the cache, as it walked the contributed object's class hierarchy on every call (so you would always get a correct answer). PW
I think I understand the issues now. We're going to test a patch here and submit it if we're happy with it.
Created attachment 127323 [details] Proposed Fix I just tested this patch against the same test harness that I used to get the original numbers. I have found significant performance benefits using these changes. It is somewhere on the order of 7-10s faster in our use case, which should nullify the performance regression that we are reporting. I believe that this patch should handle the case that you have described because the key of the map is the Class object itself, not the fully qualified name. Class equality is determined by the identity of the Class object itself so any different versions of the same class will have their own entries in the cache. Also, to prevent leaking of Classes after the bundle has unloaded there is already the existing code that will clear the cache every time any bundle is unloaded. I hope that this patch satisfies all of the requirements. Chris
(In reply to comment #13) > Created an attachment (id=127323) [details] > Proposed Fix > I believe that this patch should handle the case that you have described > because the key of the map is the Class object itself, not the fully qualified > name. Class equality is determined by the identity of the Class object itself > so any different versions of the same class will have their own entries in the > cache. Also, to prevent leaking of Classes after the bundle has unloaded there > is already the existing code that will clear the cache every time any bundle is > unloaded. So the only reason that we had added the Key class in the first place was concern about holding onto Classes (and hence ClassLoaders) longer than their lifespan (no one wants to leak classloaders :-) If we changed the map with Class keys to a WeakHashMap and continue to clear it any time a bundle is unloaded, maybe that's enough protection against holding on to Classes too long. PW
Hi Paul, We had originally considered using a WeakHashMap but decided that it was not necessary since when any bundle unloads, the map is cleared, which should remove all references to any classes. By clearing all references to the classes, shouldn't this allow the class loader to be garbage collected as well? Am I missing something?
(In reply to comment #15) > Hi Paul, > > We had originally considered using a WeakHashMap but decided that it was not > necessary since when any bundle unloads, the map is cleared, which should > remove all references to any classes. By clearing all references to the > classes, shouldn't this allow the class loader to be garbage collected as well? This works fine as long as the shortest class+class loader lifecycle matches a bundle lifecycle. I can easily write more fluid code within my one bundle that breaks that assumption (although it's usually to handle dynamic loading of things not quite bundles, but based on the fact I've experimented with it in response to newsgroup queries it is being done). PW
This seems to be quite an unusual case. Someone would have to be creating ephemeral classes and objects of these classes would have to be evaluated against some expression and that expression would specifically have to have instanceof/adapt expressions in it. I imagine that this would happen only 0.01% of the time. Perhaps in this case, whoever is creating these ephemeral classes could be given a hook to clear out the cache at the end of the lifecycle of their classes/loaders. This way the vast majority of clients don't need to pay the price for the WeakHashMap, which is not as fast as a simple HashMap and can impact the performance of the garbage collector.
(In reply to comment #17) > This seems to be quite an unusual case. Hammering a core expression is quite an unusual case, we don't do it anywhere in the SDK :-) > Perhaps in this case, whoever is creating these ephemeral classes could be > given a hook to clear out the cache at the end of the lifecycle of their > classes/loaders. This way the vast majority of clients don't need to pay the > price for the WeakHashMap, which is not as fast as a simple HashMap and can > impact the performance of the garbage collector. Introducing API to work around a limitation in an optimization is not really acceptable. It seems to me we need the WeakHashMap to not leak classloaders (since in client code there's no guarantee they are tied to the life of the bundle). I assume WeakHashMap is still a performance improvement over the Key class? PW
(In reply to comment #18) > > This seems to be quite an unusual case. > Hammering a core expression is quite an unusual case, we don't do it anywhere > in the SDK :-) Isn't the CNF part of the SDK? :-) Its extension point really promotes the use of enablement expressions and a wide variety of contributors. > > Perhaps in this case, whoever is creating these ephemeral classes could be > > given a hook to clear out the cache at the end of the lifecycle of their > > classes/loaders. This way the vast majority of clients don't need to pay the > > price for the WeakHashMap, which is not as fast as a simple HashMap and can > > impact the performance of the garbage collector. > Introducing API to work around a limitation in an optimization is not really > acceptable. It seems to me we need the WeakHashMap to not leak classloaders > (since in client code there's no guarantee they are tied to the life of the > bundle). I assume WeakHashMap is still a performance improvement over the Key > class? > PW I just tested with the WeakHashMap and it should be as good as the HashMap for this use case. I have attached a new version of the patch.
Created attachment 127489 [details] Proposed Fix #2 (WeakHashMap)
If this meets Chris's performance needs and Paul's correctness needs then I am strongly in favor of this patch.
It just occurred to me that when we originally logged the performance issue with platform we demonstrated it was easily reproducible by opening the resource perspective and creating a generic project with hundreds of files in nested folders and performing an expand all. You will find that any of our plethora of GMF based products will also "hammer" this expression for almost any action with a diagram. Also have a look at the J2EE tooling or datatools... they make very heavy use of the expressions and conditional enablement in CNF.
(In reply to comment #21) > If this meets Chris's performance needs and Paul's correctness needs then I am > strongly in favor of this patch. > This looks good to me ... if there are no other concerns, I'll put this into 3.5 M6 tomorrow. PW
N.B. If I understand the patch correctly the Key class (in the same package) can also be deleted as it will no longer be referenced.
(In reply to comment #22) > It just occurred to me that when we originally logged the performance issue > with platform we demonstrated it was easily reproducible by opening the > resource perspective and creating a generic project with hundreds of files in > nested folders and performing an expand all. This doesn't come up while using the SDK to develop the SDK, but seems to be part of the "use of GMF" workflow (and I do mean the eclipse "classic" SDK in this case). > > You will find that any of our plethora of GMF based products will also "hammer" > this expression for almost any action with a diagram. Not the SDK. > > Also have a look at the J2EE tooling or datatools... they make very heavy use > of the expressions and conditional enablement in CNF. > Not the SDK. My point being that I have had occasion to write code that would cause the failure case, and I have never encountered this bug with heavy use of the SDK ... stating that this issue is not the common case for GMF is valid, of course, but the GMF usecase is not the common case for the SDK ... it's all perspective. PW
(In reply to comment #24) > N.B. If I understand the patch correctly the Key class (in the same package) > can also be deleted as it will no longer be referenced. yes, I'll make sure it goes away as well. PW
+1, but I would use WeakHashMap instead of just Map everywhere to make it clear that this is not a normal map. And I would copy the Javadoc from Key to fgKnownClasses.
(In reply to comment #27) > +1, but I would use WeakHashMap instead of just Map everywhere to make it clear > that this is not a normal map. And I would copy the Javadoc from Key to > fgKnownClasses. I'll make the update when I check this in tomorrow. PW
Created attachment 127652 [details] Fix #2 (WeakHashMap) as checked into 3.5
(In reply to comment #29) > Created an attachment (id=127652) [details] > Fix #2 (WeakHashMap) as checked into 3.5 > Released to HEAD >20090305 PW
Created attachment 127656 [details] Fix #2 that applies to 3.4.2
In HEAD PW
Comment on attachment 127489 [details] Proposed Fix #2 (WeakHashMap) Thanx for the patch, Chris. PW
In I20090310-0100 PW