Created attachment 255067 [details]
Screenshot: YourKit - UnixFileNatives#fetchFileInfo without change

I wanted to find out whether switching Git branches with EGit could be made faster, so I profiled that action. The use case is based on the CDT repository, with a workspace set up with Oomph and thus all projects imported.

It turned out that a significant part of the time was spent in UnixFileSystem#list(File) (8.664ms), which was invoked by File#list(FilenameFilter). Almost all of these list() calls were issued by UnixFileNatives#fetchFileInfo(), which is called during the ResourceRefreshJob when the whole workspace is refreshed.

When I debugged into UnixFileNatives#fetchFileInfo(), I found that File#list() is called on the parent of each individual file or directory. So for each refreshed directory, list() is called once per child. Even worse, each call performs a linear search over the child names to find the one that matches the 'lastName' value, compared with equalsIgnoreCase(). Further, because a FilenameFilter is used, all entry names are scanned instead of stopping at the first one that matches.

The execution time of UnixFileNatives#fetchFileInfo() can be improved significantly if the parent's child names are cached between subsequent calls. The problem is that the method is static, and there is no direct way to detect that two calls belong to the same refresh action without breaking API.

To prove the effect, I tried the following: the previous array of child names and the previous directory are remembered in a cache instance. On its own this would be leaky, since the state is never freed. To prevent that, the cache is held through a SoftReference, which allows it to be garbage collected. Further, the timestamp of the last call is remembered. If two calls happen within a short time period, it is very likely that they are part of a single refresh action and that the user is not interested in resource changes that happen on disk between the two calls. The cache does the same as File#list(), only remembering the last result within a small time frame.

With the cache active, the time spent in UnixFileNatives#fetchFileInfo() went down from 7.987ms to 1.098ms, a factor of 7.3. File#list() calls went down from 43.618 to 17.459 (factor 9.6). This can be improved slightly further by also using the cache in LocalFile#childNames(), which further reduces the calls to File#info().
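
For readers who want to see the idea in code, here is a minimal sketch of such a cache. It is not the code from the Gerrit change: the class name ChildNamesCache, its method names, and the 100 ms window are assumptions made for illustration, and the age is measured from the moment the listing was read, which is one possible reading of "the timestamp of the last call". The overload that takes a lister and a clock value exists only so the behavior can be exercised without touching the disk.

```java
import java.io.File;
import java.lang.ref.SoftReference;
import java.util.function.Function;

/** Hypothetical sketch of a single-entry, time-limited cache for File#list(). */
final class ChildNamesCache {

    /** Assumed validity window; the report only speaks of "a short time period". */
    static final long MAX_AGE_MILLIS = 100;

    /** Immutable snapshot of one directory listing. */
    private static final class Entry {
        final File dir;
        final String[] names;
        final long timestamp;

        Entry(File dir, String[] names, long timestamp) {
            this.dir = dir;
            this.names = names;
            this.timestamp = timestamp;
        }
    }

    // Held softly so the GC may drop the snapshot instead of leaking it.
    private static SoftReference<Entry> last = new SoftReference<>(null);

    private ChildNamesCache() {
    }

    /** Same result as dir.list(), but reuses a recent listing of the same directory. */
    static String[] list(File dir) {
        return list(dir, File::list, System.currentTimeMillis());
    }

    /** Variant with an injectable lister and clock value, for experiments and tests. */
    static synchronized String[] list(File dir, Function<File, String[]> lister, long now) {
        Entry e = last.get();
        if (e != null && e.dir.equals(dir) && now - e.timestamp <= MAX_AGE_MILLIS) {
            return e.names; // cache hit: no file system access
        }
        String[] names = lister.apply(dir); // may be null, like File#list()
        last = new SoftReference<>(new Entry(dir, names, now));
        return names;
    }

    /** Case-insensitive lookup that stops at the first match; null if absent. */
    static String findChild(File parent, String lastName) {
        String[] names = list(parent);
        if (names == null) {
            return null;
        }
        for (String name : names) {
            if (name.equalsIgnoreCase(lastName)) {
                return name;
            }
        }
        return null;
    }
}
```

Because only the most recent directory is remembered, the sketch helps exactly in the pattern described above, where many children of the same parent are queried back to back. Callers must not modify the returned array, since it is shared between calls.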
Created attachment 255068 [details] Screenshot: YourKit - UnixFileNatives#fetchFileInfo with change
Pushed for review https://git.eclipse.org/r/#/c/51626/
Thanks for reporting, Karsten! Based on the above information the cause is the same as in bug 470153.

*** This bug has been marked as a duplicate of bug 470153 ***
You're welcome. I'll test it again with the change when I have some time.