Some Eclipse Foundation services are deprecated, or will be soon. Please ensure you've read this important communication.
Bug 266538 - [expressions] [Contributions] Performance degredation in expand all in the project explorer
Summary: [expressions] [Contributions] Performance degredation in expand all in the pr...
Status: VERIFIED FIXED
Alias: None
Product: Platform
Classification: Eclipse Project
Component: UI (show other bugs)
Version: 3.4.2   Edit
Hardware: PC Windows XP
: P3 normal (vote)
Target Milestone: 3.5 M6   Edit
Assignee: Paul Webster CLA
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2009-02-27 13:57 EST by Anthony Hunter CLA
Modified: 2009-03-10 09:57 EDT (History)
6 users (show)

See Also:


Attachments
nested HashMaps for R3_4_maintenance (5.83 KB, patch)
2009-03-02 13:46 EST, Markus Keller CLA
no flags Details | Diff
Proposed Fix (3.04 KB, patch)
2009-03-03 10:02 EST, Chris McGee CLA
no flags Details | Diff
Proposed Fix #2 (WeakHashMap) (3.59 KB, patch)
2009-03-04 10:59 EST, Chris McGee CLA
pwebster: iplog+
Details | Diff
Fix #2 (WeakHashMap) as checked into 3.5 (5.79 KB, patch)
2009-03-05 09:07 EST, Paul Webster CLA
no flags Details | Diff
Fix #2 that applies to 3.4.2 (5.35 KB, patch)
2009-03-05 09:38 EST, Paul Webster CLA
no flags Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Anthony Hunter CLA 2009-02-27 13:57:28 EST
Take a large set of projects and files in project explorer and doing an expand all using "*" on the numeric keyboard in Windows.

A noticeable (8s out of 37s) performance degradation has occurred.

After profiling both 3.4.2 and 3.4.2 with YourKit and comparing the snapshots, there was a noticeable difference in the time taken to evaluate org.eclipse.core.internal.Expressions.isSubtype() from 31ms to 5328ms. The cause seems to be the addition of a new Key class, which states that it was first introduced in 3.4.1. The majority of the performance difference seems to be in the construction of the Key object to be used to look in to the cache.

The project explorer is quite susceptible to changes in the enablement expression code (n.b. Alex Fitzpatrick's fix to cache class subtype/supertype hierarchy made a very large difference in PE performance in the past). Perhaps the project explorer needs a way to back out this new classloader aware isSubType() method because it is unlikely that there will be two different versions of the same class loaded by two different class loaders in a Project Explorer case.
Comment 1 Alex Fitzpatrick CLA 2009-02-27 14:10:13 EST
The new key was added by 234919 [expressions] instanceof map should contain cleanup code 
https://bugs.eclipse.org/bugs/show_bug.cgi?id=234919

This was released to 3.4 maintainance stream on 20081126
https://bugs.eclipse.org/bugs/show_bug.cgi?id=234919#c16
Comment 2 Paul Webster CLA 2009-02-27 14:22:27 EST
(In reply to comment #0)
> The project explorer is quite susceptible to changes in the enablement
> expression code (n.b. Alex Fitzpatrick's fix to cache class subtype/supertype
> hierarchy made a very large difference in PE performance in the past). Perhaps
> the project explorer needs a way to back out this new classloader aware
> isSubType() method because it is unlikely that there will be two different
> versions of the same class loaded by two different class loaders in a Project
> Explorer case.
> 


It only matters that 2 different plugins provide the same class FQCN (even if the classes themselves are not related).

I don't believe it's really that common, but it is definitely possible in an OSGi based environment.

Does the CNF enable the Project Explorer to accept new content providers?

PW


Comment 3 Francis Upton IV CLA 2009-02-27 15:43:28 EST
(In reply to comment #2)
> (In reply to comment #0)
> > The project explorer is quite susceptible to changes in the enablement
> > expression code (n.b. Alex Fitzpatrick's fix to cache class subtype/supertype
> > hierarchy made a very large difference in PE performance in the past). Perhaps
> > the project explorer needs a way to back out this new classloader aware
> > isSubType() method because it is unlikely that there will be two different
> > versions of the same class loaded by two different class loaders in a Project
> > Explorer case.
> > 
> 
> 
> It only matters that 2 different plugins provide the same class FQCN (even if
> the classes themselves are not related).
> 
> I don't believe it's really that common, but it is definitely possible in an
> OSGi based environment.
> 
> Does the CNF enable the Project Explorer to accept new content providers?
> 
> PW
> 
You can dynamically enable content providers in the CNF if that's what you are asking

Comment 4 Paul Webster CLA 2009-03-02 08:51:16 EST
(In reply to comment #0)
> Take a large set of projects and files in project explorer and doing an expand
> all using "*" on the numeric keyboard in Windows.
> 
> A noticeable (8s out of 37s) performance degradation has occurred.

That Key class is necessary to make this code behaviourally correct.  I assume it's still way better than pre-optimization.

I'd be willing to re-visit the Key class and see if there's further optimizations that can be done.

PW

Comment 5 Markus Keller CLA 2009-03-02 10:07:43 EST
I tried to measure this myself, but I don't see such performance differences. How exactly did you measure?

What we could do is to take the classloader out of the Key, and store it in a new Value type. The fgKnownClasses map then points from Key to Value, and the classloader would only be compared when the keys are already equal.
Comment 6 Chris McGee CLA 2009-03-02 10:27:31 EST
Hi Markus,

The CPU performance snapshots were taken in a project explorer expand-all use case. I didn't enable method invocation counting but I would imagine that the method is being invoked a large number of times, perhaps even tens of thousands of times, for all of the content providers that have enablement expressions based on the instanceof particular classes.
Comment 7 Markus Keller CLA 2009-03-02 13:46:34 EST
Created attachment 127200 [details]
nested HashMaps for R3_4_maintenance

I did some measurements with ExpressionTests.testSubTypeTiming() with TYPE_ITERATIONS = 1000000:

* Current R3_4_maintenance:
Cached Delta: 657
Instance Delta: 9344

* removed WeakReference fClassLoader from Key (incorrect):
Cached Delta: 344
Instance Delta: 9344

* 3.4.0 version (incorrect):
Cached Delta: 234
Instance Delta: 9297

* I also tried an implementation with nested HashMaps, but it wasn't much faster:
Cached Delta: 546
Instance Delta: 9422


Measurements for Expand All in Project Explorer, using org.eclipse.jdt.ui.tests from HEAD

* nested HashMap:
2.5s of 60s in Expressions.isSubtype

* Current R3_4_maintenance:
3.1s of 60s in Expressions.isSubtype

* 3.4.0 version (incorrect):
55ms of 60s in Expressions.isSubtype

The attached version with nested HashMaps would be a bit faster (probably since it does not have to allocate objects for the lookup). We could release this for 3.5, but for 3.4.2, I don't think the little gain would be worth a feature patch. The incorrect version was indeed faster, but that's the price you have to pay for correctness.

If that's not enough, you have to add caching to the Common Navigator, which evaluates the same expression over and over again.
Comment 8 Alex Fitzpatrick CLA 2009-03-02 14:16:37 EST
I completely understand the issues of "incorrectness" of the initial patch for 3.4, however I am not aware of any real world use case that would not be solved by simply flushing the map when the bundle state changes.

Do we really need this level of complication to such a simple problem?
Comment 9 Paul Webster CLA 2009-03-02 14:21:18 EST
(In reply to comment #8)
> I completely understand the issues of "incorrectness" of the initial patch for
> 3.4, however I am not aware of any real world use case that would not be solved
> by simply flushing the map when the bundle state changes.

This isn't dependent on bundle state ... I can have 2 bundles active right from the get-go that cause this problem (the bundles don't have to have anything to do with each other to create this problem).

PW
Comment 10 Alex Fitzpatrick CLA 2009-03-02 14:28:51 EST
Prior to the introduction of the cache how was this situation handled?
Comment 11 Paul Webster CLA 2009-03-02 14:42:00 EST
(In reply to comment #10)
> Prior to the introduction of the cache how was this situation handled?

It was not a problem before the cache, as it walked the contributed object's class hierarchy on every call (so you would always get a correct answer).

PW
Comment 12 Alex Fitzpatrick CLA 2009-03-02 15:32:13 EST
I think I understand the issues now. We're going to test a patch here and submit it if we're happy with it.
Comment 13 Chris McGee CLA 2009-03-03 10:02:57 EST
Created attachment 127323 [details]
Proposed Fix

I just tested this patch against the same test harness that I used to get the original numbers. I have found significant performance benefits using these changes. It is somewhere on the order of 7-10s faster in our use case, which should nullify the performance regression that we are reporting.

I believe that this patch should handle the case that you have described because the key of the map is the Class object itself, not the fully qualified name. Class equality is determined by the identity of the Class object itself so any different versions of the same class will have their own entries in the cache. Also, to prevent leaking of Classes after the bundle has unloaded there is already the existing code that will clear the cache every time any bundle is unloaded.

I hope that this patch satisfies all of the requirements.

Chris
Comment 14 Paul Webster CLA 2009-03-03 15:45:24 EST
(In reply to comment #13)
> Created an attachment (id=127323) [details]
> Proposed Fix

> I believe that this patch should handle the case that you have described
> because the key of the map is the Class object itself, not the fully qualified
> name. Class equality is determined by the identity of the Class object itself
> so any different versions of the same class will have their own entries in the
> cache. Also, to prevent leaking of Classes after the bundle has unloaded there
> is already the existing code that will clear the cache every time any bundle is
> unloaded.


So the only reason that we had added the Key class in the first place was concern about holding onto Classes (and hence ClassLoaders) longer than their lifespan (no one wants to leak classloaders :-)

If we changed the map with Class keys to a WeakHashMap and continue to clear it any time a bundle is unloaded, maybe that's enough protection against holding on to Classes too long.

PW
Comment 15 Chris McGee CLA 2009-03-03 15:53:02 EST
Hi Paul,

We had originally considered using a WeakHashMap but decided that it was not necessary since when any bundle unloads, the map is cleared, which should remove all references to any classes. By clearing all references to the classes, shouldn't this allow the class loader to be garbage collected as well?

Am I missing something?

Comment 16 Paul Webster CLA 2009-03-04 08:28:54 EST
(In reply to comment #15)
> Hi Paul,
> 
> We had originally considered using a WeakHashMap but decided that it was not
> necessary since when any bundle unloads, the map is cleared, which should
> remove all references to any classes. By clearing all references to the
> classes, shouldn't this allow the class loader to be garbage collected as well?

This works fine as long as the shortest class+class loader lifecycle matches a bundle lifecycle.  I can easily write more fluid code within my one bundle that breaks that assumption (although it's usually to handle dynamic loading of things not quite bundles, but based on the fact I've experimented with it in response to newsgroup queries it is being done).

PW
Comment 17 Chris McGee CLA 2009-03-04 09:30:12 EST
This seems to be quite an unusual case.

Someone would have to be creating ephemeral classes and objects of these classes would have to be evaluated against some expression and that expression would specifically have to have instanceof/adapt expressions in it. I imagine that this would happen only 0.01% of the time.

Perhaps in this case, whoever is creating these ephemeral classes could be given a hook to clear out the cache at the end of the lifecycle of their classes/loaders. This way the vast majority of clients don't need to pay the price for the WeakHashMap, which is not as fast as a simple HashMap and can impact the performance of the garbage collector.
Comment 18 Paul Webster CLA 2009-03-04 09:52:14 EST
(In reply to comment #17)
> This seems to be quite an unusual case.

Hammering a core expression is quite an unusual case, we don't do it anywhere in the SDK :-)

> Perhaps in this case, whoever is creating these ephemeral classes could be
> given a hook to clear out the cache at the end of the lifecycle of their
> classes/loaders. This way the vast majority of clients don't need to pay the
> price for the WeakHashMap, which is not as fast as a simple HashMap and can
> impact the performance of the garbage collector.

Introducing API to work around a limitation in an optimization is not really acceptable.  It seems to me we need the WeakHashMap to not leak classloaders (since in client code there's no guarantee they are tied to the life of the bundle). I assume WeakHashMap is still a performance improvement over the Key class?

PW
Comment 19 Chris McGee CLA 2009-03-04 10:57:45 EST
(In reply to comment #18)
> > This seems to be quite an unusual case.
> Hammering a core expression is quite an unusual case, we don't do it anywhere
> in the SDK :-)

Isn't the CNF part of the SDK? :-)

Its extension point really promotes the use of enablement expressions and a wide variety of contributors.

> > Perhaps in this case, whoever is creating these ephemeral classes could be
> > given a hook to clear out the cache at the end of the lifecycle of their
> > classes/loaders. This way the vast majority of clients don't need to pay the
> > price for the WeakHashMap, which is not as fast as a simple HashMap and can
> > impact the performance of the garbage collector.

> Introducing API to work around a limitation in an optimization is not really
> acceptable.  It seems to me we need the WeakHashMap to not leak classloaders
> (since in client code there's no guarantee they are tied to the life of the
> bundle). I assume WeakHashMap is still a performance improvement over the Key
> class?
> PW

I just tested with the WeakHashMap and it should be as good as the HashMap for this use case. I have attached a new version of the patch.
Comment 20 Chris McGee CLA 2009-03-04 10:59:08 EST
Created attachment 127489 [details]
Proposed Fix #2 (WeakHashMap)
Comment 21 Alex Fitzpatrick CLA 2009-03-04 11:18:09 EST
If this meets Chris's performance needs and Paul's correctness needs then I am strongly in favor of this patch.
Comment 22 Alex Fitzpatrick CLA 2009-03-04 11:26:19 EST
It just occurred to me that when we originally logged the performance issue with platform we demonstrated it was easily reproducible by opening the resource perspective and creating a generic project with hundreds of files in nested folders and performing an expand all.

You will find that any of our plethora of GMF based products will also "hammer" this expression for almost any action with a diagram.

Also have a look at the J2EE tooling or datatools... they make very heavy use of the expressions and conditional enablement in CNF.
Comment 23 Paul Webster CLA 2009-03-04 11:45:20 EST
(In reply to comment #21)
> If this meets Chris's performance needs and Paul's correctness needs then I am
> strongly in favor of this patch.
> 

This looks good to me ... if there are no other concerns, I'll put this into 3.5 M6 tomorrow.

PW


Comment 24 Alex Fitzpatrick CLA 2009-03-04 12:36:29 EST
N.B. If I understand the patch correctly the Key class (in the same package) can also be deleted as it will no longer be referenced.
Comment 25 Paul Webster CLA 2009-03-04 12:46:31 EST
(In reply to comment #22)
> It just occurred to me that when we originally logged the performance issue
> with platform we demonstrated it was easily reproducible by opening the
> resource perspective and creating a generic project with hundreds of files in
> nested folders and performing an expand all.

This doesn't come up while using the SDK to develop the SDK, but seems to be part of the "use of GMF" workflow (and I do mean the eclipse "classic" SDK in this case).

> 
> You will find that any of our plethora of GMF based products will also "hammer"
> this expression for almost any action with a diagram.

Not the SDK.

> 
> Also have a look at the J2EE tooling or datatools... they make very heavy use
> of the expressions and conditional enablement in CNF.
> 

Not the SDK.

My point being that I have had occasion to write code that would cause the failure case, and I have never encountered this bug with heavy use of the SDK ... stating that this issue is not the common case for GMF is valid, of course, but the GMF usecase is not the common case for the SDK ... it's all perspective.

PW
Comment 26 Paul Webster CLA 2009-03-04 12:46:53 EST
(In reply to comment #24)
> N.B. If I understand the patch correctly the Key class (in the same package)
> can also be deleted as it will no longer be referenced.

yes, I'll make sure it goes away as well.

PW
Comment 27 Markus Keller CLA 2009-03-04 15:25:04 EST
+1, but I would use WeakHashMap instead of just Map everywhere to make it clear
that this is not a normal map. And I would copy the Javadoc from Key to
fgKnownClasses.
Comment 28 Paul Webster CLA 2009-03-04 15:27:03 EST
(In reply to comment #27)
> +1, but I would use WeakHashMap instead of just Map everywhere to make it clear
> that this is not a normal map. And I would copy the Javadoc from Key to
> fgKnownClasses.

I'll make the update when I check this in tomorrow.

PW
Comment 29 Paul Webster CLA 2009-03-05 09:07:48 EST
Created attachment 127652 [details]
Fix #2 (WeakHashMap) as checked into 3.5
Comment 30 Paul Webster CLA 2009-03-05 09:37:30 EST
(In reply to comment #29)
> Created an attachment (id=127652) [details]
> Fix #2 (WeakHashMap) as checked into 3.5
> 

Released to HEAD >20090305
PW
Comment 31 Paul Webster CLA 2009-03-05 09:38:19 EST
Created attachment 127656 [details]
Fix #2 that applies to 3.4.2
Comment 32 Paul Webster CLA 2009-03-05 09:39:35 EST
In HEAD
PW
Comment 33 Paul Webster CLA 2009-03-05 09:40:10 EST
Comment on attachment 127489 [details]
Proposed Fix #2 (WeakHashMap)

Thanx for the patch, Chris.
PW
Comment 34 Paul Webster CLA 2009-03-10 09:57:30 EDT
In I20090310-0100
PW