Bug 313366

Summary: SiteListener canonicalizes filenames many times
Product: [Eclipse Project] Equinox
Component: p2
Reporter: John Arthorne <john.arthorne>
Assignee: John Arthorne <john.arthorne>
Status: RESOLVED FIXED
QA Contact:
Severity: normal
Priority: P3
CC: pascal
Version: 3.6
Keywords: performance
Target Milestone: 3.6 RC2
Flags: pascal: review+, dj.houghton: review+
Hardware: PC
OS: Windows XP
Whiteboard:
Attachments: Profiler output (flags: none), Fix v01 (flags: none)

Description John Arthorne CLA 2010-05-18 10:36:13 EDT
Build: 3.6 RC1

When processing platform.xml, the site listener does this:

for each file X in the file system:
   for each file Y in USER_INCLUDE list:
      if (X.endsWith(new File(Y).toString()))
         process file X

This means the expression "new File(Y).toString()" is evaluated once per file X for each file Y, i.e. (number of files) x (number of USER_INCLUDE entries) times in total. In a very large install this adds up to a large performance cost during startup.

The purpose of "new File(Y).toString()" is to normalize the filenames - eliminate duplicate slashes, etc. We can make a simple optimization by normalizing all the file names at the beginning so we don't need to do this in the loop.
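
As a rough illustration of the optimization described above (not the attached patch), the following Java sketch normalizes every include entry once, at construction time, and then reuses the cached strings in the inner loop. The class name IncludeMatcher and the method isIncluded are hypothetical and do not come from the SiteListener source.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration only; not the real SiteListener code.
class IncludeMatcher {
    // Include entries, normalized exactly once.
    private final List<String> normalizedIncludes = new ArrayList<>();

    IncludeMatcher(List<String> userIncludes) {
        for (String include : userIncludes) {
            // new File(...).toString() collapses duplicate separators and
            // converts them to the platform separator.
            normalizedIncludes.add(new File(include).toString());
        }
    }

    // Still one endsWith check per (file, include) pair, but no File
    // object is allocated inside the loop any more.
    boolean isIncluded(String path) {
        for (String include : normalizedIncludes) {
            if (path.endsWith(include)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        IncludeMatcher matcher =
            new IncludeMatcher(Arrays.asList("plugins//a.jar"));
        String sep = File.separator;
        String candidate = "eclipse" + sep + "plugins" + sep + "a.jar";
        System.out.println(candidate + " included: " + matcher.isIncluded(candidate));
    }
}
```

The matching semantics stay the same as in the original loop; only the point at which normalization happens changes.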
Comment 1 John Arthorne CLA 2010-05-18 10:37:46 EDT
Created attachment 168939
Profiler output
Comment 2 John Arthorne CLA 2010-05-18 11:02:21 EDT
Created attachment 168944
Fix v01

Simple fix to normalize filenames only once during construction
Comment 3 John Arthorne CLA 2010-05-18 11:05:13 EDT
Note that even with the fix, this is still O(n^2) cost, and it could potentially be made faster by sorting the list of files X. However, because the code currently does an endsWith comparison, that change would be fairly complicated and I'm not sure it would end up much faster. The main bottleneck here is the new java.io.File() construction, so removing that should eliminate this code as a hotspot.
Comment 4 John Arthorne CLA 2010-05-18 14:26:12 EDT
I think we should do this for RC2. The fix is quite simple and safe - really just moving the canonicalization outside the loop. I will re-run all the tests and then release.
Comment 5 John Arthorne CLA 2010-05-18 15:01:18 EDT
Fix released.