Bug 313447 - Avoid MD5 computation for publishing from dropins reconciler
Status: RESOLVED FIXED
Alias: None
Product: Equinox
Classification: Eclipse Project
Component: p2
Version: 3.6
Hardware: PC Windows XP
Importance: P3 normal
Target Milestone: 3.6.1
Assignee: John Arthorne CLA
Keywords: performance
Reported: 2010-05-18 17:29 EDT by John Arthorne CLA
Modified: 2010-06-22 10:37 EDT

Attachments
Profiler output from running reconciler app on large product after installing a feature (15.48 KB, image/png)
2010-06-02 09:59 EDT, John Arthorne CLA
Fix v01 (8.25 KB, patch)
2010-06-04 10:19 EDT, John Arthorne CLA
Fix v02 (9.43 KB, patch)
2010-06-04 13:21 EDT, John Arthorne CLA
YourKit hotspot list before fix (3.08 KB, text/plain)
2010-06-04 13:31 EDT, John Arthorne CLA
YourKit hotspot list after fix (12.97 KB, text/plain)
2010-06-04 13:33 EDT, John Arthorne CLA

Description John Arthorne CLA 2010-05-18 17:29:31 EDT
The MD5 computation seems to take a big chunk of time during startup when there are many bundles in the dropins folder. We should consider omitting computation of the MD5 property in this case since it isn't really needed for artifacts generated based on the dropins folder.
Comment 1 John Arthorne CLA 2010-05-19 09:29:21 EDT
I will investigate, and if it provides a big improvement we could consider it for 3.6.1.
Comment 2 John Arthorne CLA 2010-06-02 09:59:01 EDT
Created attachment 170801
Profiler output from running reconciler app on large product after installing a feature
Comment 3 John Arthorne CLA 2010-06-04 10:19:39 EDT
Created attachment 171104
Fix v01

This patch introduces a publisher option to avoid MD5 computation. This option is then used in the RepositoryListener (reconciler). All other callers of the publisher will get MD5 hashes by default.

This patch is very conservative to avoid breaking any provisional API. If I were doing this in 3.7 I would clean this up and remove the old methods. In particular, there are several places where IPublisherInfo is needed but not passed through. If we ever promote this to real API we'll want the method signatures to pass IPublisherInfo around as much as possible to allow for these kinds of options in the future.
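
For readers without the patch at hand, here is a minimal, self-contained sketch of the idea in this comment: an option bit that callers can set to skip the MD5 hash, with hashing left on by default. It is not the code from the attached patch. The class name, the NO_MD5 constant and its value, and the describeArtifact helper are all hypothetical, and the property keys are used here only for illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.LinkedHashMap;
import java.util.Map;

public class Md5OptionSketch {
    // Hypothetical option bit; the real publisher option may use a different name and value.
    static final int NO_MD5 = 1 << 3;

    // Builds the descriptive properties for one artifact file.
    // The MD5 property is computed unless the caller sets NO_MD5.
    static Map<String, String> describeArtifact(Path file, int options) throws IOException {
        Map<String, String> props = new LinkedHashMap<>();
        props.put("download.size", Long.toString(Files.size(file)));
        if ((options & NO_MD5) == 0) {
            // Default path: read the whole file to hash it (the expensive step).
            props.put("download.md5", md5Hex(file));
        }
        return props;
    }

    // Streams the file through an MD5 digest and returns the lowercase hex string.
    static String md5Hex(Path file) throws IOException {
        MessageDigest digest;
        try {
            digest = MessageDigest.getInstance("MD5");
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
        byte[] buffer = new byte[8192];
        try (InputStream in = Files.newInputStream(file)) {
            int n;
            while ((n = in.read(buffer)) != -1) {
                digest.update(buffer, 0, n);
            }
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : digest.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }
}
```

In this sketch a reconciler-style caller would pass NO_MD5, while every other caller passes 0 and keeps getting the hash, which mirrors the "all other callers get MD5 hashes by default" behavior described above.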
Comment 4 John Arthorne CLA 2010-06-04 10:22:42 EDT
Pascal, what do you think of this? I can't think of any scenarios where computing MD5 hashes in the reconciler is useful. From the profiler output we can see this has a big impact on performance, which cancels most of our reconciler performance improvements since 3.4.2 (which didn't compute the MD5 hash).
Comment 5 John Arthorne CLA 2010-06-04 13:12:22 EDT
This makes a huge difference to startup/reconcile time! Here is the scenario I tested:

1) Unzip Eclipse SDK 3.6 RC4 (I20100603-1500)
2) Startup once, shutdown.
3) Unzip the Helios modeling EPP package into dropins (1000 plugins, 232 features)
4) Startup.

Here is the result both with and without the patch:

3.6 RC4 before patch:

Starting application: 130625
Application Started: 155484


3.6 RC4 after patch:

Starting application: 29329
Application Started: 36282

Total startup time is 4x faster!
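
As a quick check of the "4x" figure, the ratio can be worked out from the two "Application Started" values quoted above. This is only a small arithmetic sketch. The class and method names are made up for illustration, and the comment does not state the unit of the timings, so none is assumed here.

```java
public class StartupSpeedup {
    // Ratio of the old elapsed time to the new one.
    static double speedup(long before, long after) {
        return (double) before / after;
    }

    // Share of the original time that was eliminated, in percent.
    static double percentSaved(long before, long after) {
        return 100.0 * (before - after) / before;
    }

    public static void main(String[] args) {
        long before = 155484; // "Application Started" before the patch
        long after = 36282;   // "Application Started" after the patch
        System.out.printf("speedup: %.2fx%n", speedup(before, after));
        System.out.printf("time saved: %.1f%%%n", percentSaved(before, after));
    }
}
```

The ratio comes out a little above 4, so "4x faster" is a slightly conservative rounding.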
Comment 6 John Arthorne CLA 2010-06-04 13:21:52 EDT
Created attachment 171131
Fix v02

This is the fix I was running with. The only difference from the previous one is that it also omits MD5 computation when publishing features.
Comment 7 John Arthorne CLA 2010-06-04 13:31:18 EDT
Created attachment 171135
YourKit hotspot list before fix
Comment 8 John Arthorne CLA 2010-06-04 13:33:37 EDT
Created attachment 171136
YourKit hotspot list after fix

I didn't believe those numbers, so I ran the scenario again with and without the patch, this time with a profiler attached. I got similar results, and the two attachments show the hotspots (methods taking the most time) while running the scenario. Before the patch, 85% of the startup time was taken by MD5 hash computation, which disappears after the patch.
Comment 9 Gary Karasiuk CLA 2010-06-04 13:53:21 EDT
(In reply to comment #5)
This is great news!
Comment 10 Pascal Rapicault CLA 2010-06-04 14:00:35 EDT
Removing the MD5 computation is fine by me.
But I have an even better scenario: download RC4, use the p2 UI to install everything, restart. No time taken!
Comment 11 John Arthorne CLA 2010-06-22 10:37:25 EDT
Fix released for 3.6.1 and 3.7.