Bug 313366 - SiteListener canonicalizes filenames many times
Summary: SiteListener canonicalizes filenames many times
Status: RESOLVED FIXED
Alias: None
Product: Equinox
Classification: Eclipse Project
Component: p2
Version: 3.6
Hardware: PC Windows XP
Importance: P3 normal
Target Milestone: 3.6 RC2
Assignee: John Arthorne CLA
QA Contact:
URL:
Whiteboard:
Keywords: performance
Depends on:
Blocks:
 
Reported: 2010-05-18 10:36 EDT by John Arthorne CLA
Modified: 2010-05-18 15:01 EDT

See Also:
pascal: review+
dj.houghton: review+


Attachments
Profiler output (9.15 KB, image/png)
2010-05-18 10:37 EDT, John Arthorne CLA
Fix v01 (2.58 KB, patch)
2010-05-18 11:02 EDT, John Arthorne CLA

Description John Arthorne CLA 2010-05-18 10:36:13 EDT
Build: 3.6 RC1

When processing platform.xml, the site listener does this:

for each file X in the file system:
   for each file Y in USER_INCLUDE list:
      if (X.endsWith(new File(Y).toString()))
         process file X

This means the expression "new File(Y).toString()" is executed once per file X for each file Y. In a very large install this adds up to a large performance cost during startup.

The purpose of "new File(Y).toString()" is to normalize the filenames - eliminate duplicate slashes, etc. We can make a simple optimization by normalizing all the file names at the beginning so we don't need to do this in the loop.
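
A minimal sketch of that optimization, for illustration only. The class and method names (IncludeMatcher, isIncluded) are hypothetical and are not taken from SiteListener or from the attached patch; the sketch assumes the include list is available as a list of strings and that the paths being tested are already in platform form.

```java
import java.io.File;
import java.util.List;

/** Hypothetical helper, not the actual SiteListener code. */
class IncludeMatcher {
    // Include entries, normalized once at construction time
    // instead of once per comparison inside the loop.
    private final String[] normalizedIncludes;

    IncludeMatcher(List<String> userIncludes) {
        normalizedIncludes = new String[userIncludes.size()];
        for (int i = 0; i < normalizedIncludes.length; i++) {
            // new File(...).toString() collapses duplicate separators and
            // converts '/' to the platform separator character.
            normalizedIncludes[i] = new File(userIncludes.get(i)).toString();
        }
    }

    /** Returns true if the given path ends with any normalized include entry. */
    boolean isIncluded(String path) {
        for (String include : normalizedIncludes) {
            // Only a string comparison remains in the inner loop.
            if (path.endsWith(include)) {
                return true;
            }
        }
        return false;
    }
}
```

With this shape, the java.io.File objects are created once per include entry rather than once per (X, Y) pair; the nested comparison itself is unchanged.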
Comment 1 John Arthorne CLA 2010-05-18 10:37:46 EDT
Created attachment 168939
Profiler output
Comment 2 John Arthorne CLA 2010-05-18 11:02:21 EDT
Created attachment 168944
Fix v01

Simple fix to normalize filenames only once during construction
Comment 3 John Arthorne CLA 2010-05-18 11:05:13 EDT
Note that even with the fix, this is still O(n^2) cost, and could potentially be made faster by sorting the list of files X. However, because the code currently does an endsWith comparison, that optimization would be fairly complicated and I'm not sure it would end up much faster. The main bottleneck here is the new java.io.File() construction, so removing that should eliminate this code as a hotspot.
Comment 4 John Arthorne CLA 2010-05-18 14:26:12 EDT
I think we should do this for RC2. The fix is quite simple and safe - really just moving the canonicalization outside the loop. I will re-run all the tests and then release.
Comment 5 John Arthorne CLA 2010-05-18 15:01:18 EDT
Fix released.