Created attachment 220629: Perf Analysis

I'm using EGit with Eclipse Indigo. Whenever I try to do anything git-related from the UI, such as committing or deleting files, Eclipse hangs for a couple of minutes before finally finishing the operation. This happens for even the smallest git-related operations.

I've attached the output of a performance sampling, which shows that the bottleneck is JGit walking the filesystem. I'm not sure you'll be able to reproduce this. I might write a little program which just walks my filesystem to see if that runs slowly.
I just made a quick program which walks my git repo. It took 46 seconds to walk 237,280 files. When I told it to exclude 'build' directories, it took half a second to walk 14,646 files. Weird how the times aren't linear.

I'm guessing that the FileTreeIterator is not ignoring certain directories that should be excluded from the walk, and this is probably why the performance is so slow.
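For anyone who wants to repeat this measurement, here is a minimal sketch of such a walker. It is not the program used above (that one was not attached); the class name `RepoWalkTimer` and the method `countFiles` are made up for illustration, and it uses only the standard `java.nio.file` API. It counts non-directory entries and never descends into directories whose name is in the exclusion set.

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.Collections;
import java.util.Set;

public class RepoWalkTimer {

    /**
     * Counts non-directory entries under root. Directories whose simple name
     * is in skipDirNames are pruned, so their contents are never listed.
     */
    static long countFiles(final Path root, final Set<String> skipDirNames) throws IOException {
        final long[] count = {0};
        Files.walkFileTree(root, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult preVisitDirectory(Path dir, BasicFileAttributes attrs) {
                Path name = dir.getFileName();
                if (!dir.equals(root) && name != null && skipDirNames.contains(name.toString())) {
                    return FileVisitResult.SKIP_SUBTREE; // prune, do not list children
                }
                return FileVisitResult.CONTINUE;
            }

            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) {
                count[0]++;
                return FileVisitResult.CONTINUE;
            }
        });
        return count[0];
    }

    public static void main(String[] args) throws IOException {
        // Assumption: the repository root is passed as the first argument.
        Path root = Paths.get(args.length > 0 ? args[0] : ".");

        long t0 = System.nanoTime();
        long all = countFiles(root, Collections.<String>emptySet());
        long t1 = System.nanoTime();
        long pruned = countFiles(root, Collections.singleton("build"));
        long t2 = System.nanoTime();

        System.out.printf("full walk:   %d files in %d ms%n", all, (t1 - t0) / 1_000_000);
        System.out.printf("pruned walk: %d files in %d ms%n", pruned, (t2 - t1) / 1_000_000);
    }
}
```

Run it twice before comparing numbers, since the first walk also warms the OS file cache and that can account for part of the difference.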
Yeah, just confirmed my theory. I did a gradle clean, which wiped the build directories so that only source was left. Then I did a commit from Eclipse and it was super-fast.
My conclusion is that JGit is not respecting the .gitignore file when walking the file hierarchy. Seems like if it only walked the parts of the tree that were not in the ignore file, it would be much faster.
I think the real problem is that IndexDiffFilter is not working. E.g. the contents of bin directories should be ignored by most operations unless we already track content there.

Try this:

    #jgit init
    #mkdir d
    #touch d/f
    #touch f
    #jgit add f
    #jgit diff
    diff --git a/d/f b/d/f
    new file mode 100644
    index 0000000..e69de29
    --- /dev/null
    +++ b/d/f

jgit diff should not output anything here, but apparently it descends into "d".

Ignoring directories completely due to patterns may be hard in general. E.g. the following two rules would ignore everything in a folder named bin, unless it contains class files at any depth.

    bin/
    !*.class
I am not an expert on file matching patterns, but I think if this logic was added to the filter, performance would be vastly increased (especially with large git repos).
Jens, got an idea of why the IndexDiffFilter does not appear to work as advertised? Is it only usable with IndexDiff?
See https://git.eclipse.org/r/7721 for a failing unit test.
I think I'm seeing the same problem or a related one: I have a really tiny git repo, but my worktree has some really deep subtrees (some physical and some symlinked, but all are ignored by .gitignore). Command-line git is lightning fast, but when I commit even a single file through EGit, it takes minutes. My ignored subtrees are mostly at the root level, even outside the scope of any Eclipse project that I'm working on.
*** Bug 448774 has been marked as a duplicate of this bug. ***
As far as I see FileTreeIterator indeed iterates over everything. TreeWalk advances the iterator, and only then applies the filter (NotIgnoredFilter, or IndexDiffFilter), and then skips over ignored files. But it still iterates through them, and the FileTreeIterator does a directory.listFiles() for each ignored directory (and subdirectory) and then even a FS.getAttributes() on each file listed. FS.getAttributes() may be a relatively expensive operation, especially on Windows. What one would need is a FileTreeIterator that would skip over ignored directories transparently, without doing the expensive directory listing and getting attributes, unless the directory contained tracked files, in which case the working tree directory still would need to be traversed, but only for the tracked paths.
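To make the idea concrete, here is a rough standalone sketch of such a pruning walk. It is not JGit code and not what any Gerrit change actually implements; `PruningWalk`, `walk`, and the `isIgnored` predicate are hypothetical names, the ignore check is reduced to a simple predicate on repository-relative paths, and the index is modelled as a sorted set of tracked paths. Ignored directories are never listed; if they hold tracked paths, only those paths are checked individually.

```java
import java.io.File;
import java.util.NavigableSet;
import java.util.SortedSet;
import java.util.function.Predicate;

public class PruningWalk {

    /**
     * Collects repository-relative paths of working tree files that are
     * either tracked or not ignored. prefix is "" for the root, otherwise a
     * directory path ending in '/'.
     */
    static void walk(File root, String prefix, Predicate<String> isIgnored,
                     NavigableSet<String> tracked, SortedSet<String> out) {
        File dir = prefix.isEmpty() ? root : new File(root, prefix);
        File[] children = dir.listFiles();
        if (children == null) {
            return; // not a directory, or not readable
        }
        for (File child : children) {
            String path = prefix + child.getName();
            if (child.isDirectory()) {
                String sub = path + "/";
                if (!isIgnored.test(sub)) {
                    walk(root, sub, isIgnored, tracked, out);
                    continue;
                }
                // Ignored directory: do not list it. Only look at tracked
                // paths below it, which sort contiguously after the prefix.
                for (String t : tracked.tailSet(sub, true)) {
                    if (!t.startsWith(sub)) {
                        break;
                    }
                    if (new File(root, t).isFile()) {
                        out.add(t);
                    }
                }
            } else if (tracked.contains(path) || !isIgnored.test(path)) {
                out.add(path);
            }
        }
    }
}
```

With this shape, the cost of an ignored directory is proportional to the number of tracked paths inside it rather than to the number of files on disk. A real implementation would also have to deal with negated ignore rules, nested .gitignore files, and symlinks, which this sketch leaves out.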
New Gerrit change created: https://git.eclipse.org/r/120337
Gerrit change https://git.eclipse.org/r/120337 was merged to [master]. Commit: http://git.eclipse.org/c/jgit/jgit.git/commit/?id=d7deda98d0a18ca1e3a1fbb70acf8e7cbcf25833