Community
Participate
Working Groups
We found several bad results from histogram diff when using it in our gerrit instance: - looks like here http://egit.eclipse.org/r/#patch,unified,1766,3,org.eclipse.egit.ui/src/org/eclipse/egit/ui/internal/history/GitHistoryPage.java the new algorithm shows too coarse grained diffs. Large blocks seem to be changed but these blocks only differ in a few lines, so that it's hard to spot what really changed. - This diff also looks odd : http://egit.eclipse.org/r/#patch,unified,1783,2,org.eclipse.egit.core/pom.xml look at the block containing <artifactId>maven-osgi-source-plugin</artifactId> - Here is another case where the new diff fails: instead of showing just the addition of the method testDeleteVsModify() a large block is shown to be changed compared to the old version http://egit.eclipse.org/r/#patch,sidebyside,1809,1,org.eclipse.jgit.test/tst/org/eclipse/jgit/merge/MergeAlgorithmTest.java This problems were originally reported in the mail thread http://dev.eclipse.org/mhonarc/lists/egit-dev/msg01560.html There was one more problem mentioned in that thread which has already been fixed. Creating this bug to make tracking this easier
I think I understand the problem. When we found the initial LCS for a file, we started the LCS region on a blank line. This occurs often in the file, and thus has a count that is higher than the count for the line "\t/**\n". (It is "less unique".) When we later discover the insertion of "\t/**\n" at the start of the new method, we select this as our LCS because it has a lower count than the blank line we initially set the LCS on earlier. The problem here is, we aren't looking to see if any of the common lines within the prior LCS region have a lower count than the initial line. We just used the initial line's count and said "good enough". But then we apply this optimization where we don't even look at the other lines in the LCS, and thus we never consider if they have a lower count. We need to use the lowest count within the LCS to set the count for the LCS.
(In reply to comment #0) > We found several bad results from histogram diff I think http://egit.eclipse.org/r/1829 fixes the issues you identified on the gerrit instance. Unfortunately there is still a problem with the output, we aren't getting a very clean diff for this case: @@ -188,6 +189,16 @@ public void testSameModification() throws IOException { assertEquals(replace_C_by_Z, merge(base, replace_C_by_Z, replace_C_by_Z)); + } + + /** + * Check that a deleted vs. a modified line shows up as conflict (see Bug + * 328551) + * + * @throws IOException + */ + public void testDeleteVsModify() throws IOException { + assertEquals(A+B+XXX_0+XXX_1+Z+XXX_2+D+E+F+G+H+I+J, merge(base, delete_C, replace_C_by_Z)); } private String merge(String commonBase, String ours, String theirs) throws IOException { The "\t}\n" and "\n" initial added lines should be at the end of the new testDeleteVsModify method.
(In reply to comment #2) > > Unfortunately there is still a problem with the output, > we aren't getting a very clean diff for this case: > > @@ -188,6 +189,16 @@ > public void testSameModification() throws IOException { > assertEquals(replace_C_by_Z, > merge(base, replace_C_by_Z, replace_C_by_Z)); > + } > + This bad case is fixed by http://egit.eclipse.org/r/1831
Fixed, and a new Gerrit deployed to egit.eclipse.org.