test repo: git://git.eclipse.org/gitroot/e4/eclipse.platform.ui.git

Looking at the converted Platform UI repo, I see that the cvs2git tool leaves some of our tags and branches probably not in the state we want, and our maintenance branches definitely not in the state we want.

ex: R3_6 and R3_6_maintenance

A fix for "Bug 315532. [Workbench] ClassCastException: org.eclipse.ui.testing.Contribut" was the last 3.6 commit from our team. We tagged all of our projects as R3_6, and then, as fixes needed to go in, we branched some of the projects from that tag: org.eclipse.ui.workbench but not org.eclipse.core.commands, for example.

In the repo, commit 26d03f83969d9f7c60cf8f985f48dc1400473de4 is the last 3.6.0 commit corresponding to that fix, and it is in the tree for master. Good. What we would want is that commit tagged as R3_6, and for it to be the branch point for the R3_6_maintenance commits.

Instead of tagging that commit as R3_6, cvs2svn created a child commit aacd5a8824d50320c238df64b87e3ad3c04b0923 off of the fix, deleted all files/projects that weren't tagged with R3_6, and then tagged that. It's a leaf commit whose only purpose is to represent R3_6.

cvs2svn created another child commit a305cecb492d6d12513e5c0ba338bc9f51f0dc0a off of the fix, deleted all projects that didn't have an R3_6_maintenance branch, and then used this commit as the parent of the R3_6_maintenance commits. Because of this behaviour, R3_6_maintenance is even worse. It contains 12 out of 42 projects.

The tool is doing what it is supposed to do, as its representations are accurate. But it leaves the system in an unintuitive state. If we were to work on R3_6_maintenance, I would expect to be able to see all of our projects on the branch, where most of them would be the same as if I were to check out R3_6.

What can we do here? I don't think we can leave the system in this state. Is there an accepted way to "fix" these kinds of tag/branch problems? Either forget about the deletes, move the R3_6 tag, and reparent the R3_6_maintenance commits? Or simply reparent the R3_6_maintenance commits off of the delete commit generated for R3_6 (but then R3_6 won't be on master)?

I'm open to suggestions.

PW
Are we doing the conversion off of a copy of our CVS repository? If yes, one experiment that we could do is branching the remaining projects off for 3_6_maintenance as well, and tagging projects with R3_6 that weren't tagged (all of this on the CVS side, to the copied repository), and then trying to convert one more time.
(In reply to comment #1)
> Are we doing the conversion off of a copy of our CVS repository? If yes, one
> experiment that we could do is branching the remaining projects off for
> 3_6_maintenance as well, and tagging projects with R3_6 that weren't tagged
> (all of this on the CVS side, to the copied repository), and then trying to
> convert one more time.

That might be one option, but that will involve a fair bit of manual handwaving. We would need to find each tag that we care about, find its matching date/time, check out everything from CVS based on timestamp, tag everything not already tagged in each case (like the e4 bundles that graduated with 4.0), and branch them all if they weren't already at the branch point.

PW
And keep in mind that this doesn't just apply to 3.6 but to all the previous maintenance branches we want to convert....like 10 years worth of them :-) I'm going to write a note to the cvs2git mailing list about this once I have a few minutes after dealing with the Indigo release stuff.
I'm looking through the cvs2git tool's issue tracker to see if any of the open issues match this problem, some seem similar http://cvs2svn.tigris.org/issues/buglist.cgi?Submit+query=Submit+query&component=cvs2svn&issue_status=NEW&issue_status=STARTED&issue_status=REOPENED&email1=&emailtype1=exact&emailassigned_to1=1&email2=&emailtype2=exact&emailreporter2=1&issueidtype=include&issue_id=&changedin=&votes=&chfieldfrom=&chfieldto=Now&chfieldvalue=&short_desc=&short_desc_type=fulltext&long_desc=&long_desc_type=fulltext&issue_file_loc=&issue_file_loc_type=fulltext&status_whiteboard=&status_whiteboard_type=fulltext&keywords=&keywords_type=anytokens&field0-0-0=noop&type0-0-0=noop&value0-0-0=&cmdtype=doit&order=Reuse+same+sort+as+last+time
We're looking at cvs2svn to see if it can be told not to generate those "manufactured commits". But the git commands to fix my R3_6_maintenance example are:

The fix on master was commit 26d03f83969d9f7c60cf8f985f48dc1400473de4

Fix the R3_6 tag:

    git tag -f R3_6 26d03f83969d9f7c60cf8f985f48dc1400473de4

Fix the R3_6_maintenance branch:

    git rebase --onto 26d03f83969d9f7c60cf8f985f48dc1400473de4 \
        a305cecb492d6d12513e5c0ba338bc9f51f0dc0a

The rebase replays all of the R3_6_maintenance commits on top of the real commit on master.

from http://stackoverflow.com/questions/3810348/setting-git-parent-pointer-to-a-different-parent

PW
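To see why the rebase restores the missing projects, here is a minimal sketch in a throwaway repository. Everything in it is made up for illustration (the two directory names, the file contents, the commit messages, and the demo identity); only the shape of the history follows the description above: a fix commit, a manufactured delete commit on top of it, and a maintenance commit on top of that.

```shell
#!/bin/bash
# Throwaway demo; all names and contents are hypothetical, not the real repo.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.org
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.org

demo=$(mktemp -d)
cd "$demo"
git init -q .
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is "master"

# The "fix": last release commit on master, with two projects present.
mkdir workbench commands
echo v1 > workbench/a.txt
echo v1 > commands/b.txt
git add .
git commit -q -m "last 3.6.0 fix"
fix=$(git rev-parse HEAD)

# Stand-in for the manufactured commit: deletes the project that was never branched.
git checkout -q -b R3_6_maintenance
git rm -q -r commands
git commit -q -m "manufactured: delete unbranched projects"
manufactured=$(git rev-parse HEAD)

# A real maintenance change on top of the manufactured commit.
echo v2 > workbench/a.txt
git commit -q -a -m "maintenance fix"

# Replay everything after $manufactured directly onto the real fix commit.
git rebase -q --onto "$fix" "$manufactured" R3_6_maintenance

# Both projects are visible on the branch again.
git ls-tree --name-only HEAD
```

The delete commit is simply left behind, so the maintenance change now sits directly on the fix and the untouched project shows through from the parent.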
As for changing the tool, here's a patch we can try to see if that gets rid of the delete commits (adapted from http://permalink.gmane.org/gmane.comp.version-control.subversion.cvs2svn.devel/2711 Issue 3)

----
Index: cvs2svn_lib/dvcs_common.py
===================================================================
--- cvs2svn_lib/dvcs_common.py (revision 5360)
+++ cvs2svn_lib/dvcs_common.py (working copy)
@@ -278,10 +278,9 @@
       source_node = self._mirror.get_old_lod_directory(source_lod, svn_revnum)
     except KeyError:
       raise InternalError('Source %r does not exist' % (source_lod,))
-    return (
-        set([cvs_symbol.cvs_file for cvs_symbol in cvs_symbols])
-        == set(self._get_all_files(source_node))
-        )
+    set1 = set([cvs_symbol.cvs_file for cvs_symbol in cvs_symbols])
+    set2 = set(self._get_all_files(source_node))
+    return (set1 < set2)
 
   def _get_all_files(self, node):
     """Generate all of the CVSFiles under NODE."""
----

I haven't tried it yet, but for the simple tag and branch cases it should not create the delete commits.

Downside: We can't run the verification step without modifying that as well.

PW
A work in progress for moving tags on "manufactured by cvs2svn" commits back onto their parent.

#!/bin/bash
#
TAG=R3_6
T=$( git log -1 --format="format:%H:%an" $TAG )
if echo "$T" | grep cvs2svn >/dev/null; then
    CMT=$( echo $T | cut -f1 -d: )
    echo Found cvs2svn $CMT
    if [ -z "$(git log -1 --merges --format="format:%h" $TAG)" ]; then
        PT=$( git log -1 --format="format:%an" "${CMT}^1" )
        if ! echo "$PT" | grep cvs2svn >/dev/null; then
            echo git tag -f "$TAG" "${CMT}^1"
        fi
    fi
fi
Based on further analysis of what the cvs2git script does, I think that we need to:

1) run the cvs2git script and then run the verification.

2) pick the set of tags we care about, find their correct commits, and move each tag using "git tag -f". In Platform UI, I'd think:

R3_1 R3_1_1 R3_1_2
R3_2 R3_2_1 R3_2_2
R3_3 R3_3_1 R3_3_1_1 R3_3_2
R3_4 R3_4_1 R3_4_2
R3_5 R3_5_1 R3_5_2
R3_6 R3_6_1 R3_6_2

Maybe some of the Root_* branch points.

3) pick the branches we care about, find their starting commits, and re-parent them using "git rebase --onto <newParent> <oldParent>":

perf_31x perf_32x perf_33x perf_34x
R3_1_maintenance
R3_2_maintenance
R3_3_maintenance
R3_4_1_maintenance_patches
R3_4_maintenance
R3_4_maintenance_patches
R3_4_maintenance_patches_2
R3_5_maintenance
R3_6_maintenance
R3_6_maintenance_patches
R4_HEAD

That'll make our branches accurate, and we'll just ignore the rest of the manufactured commits.

PW
Created attachment 198496
Fix tags report script

This script scans through the tags and generates three types of lines:

A tag that's probably applied correctly:

    Tag clean: Bug182059_before_HEAD_rebase commit: 4e9b17ddda7d26513ae89a265af4d131b3c98e8c

A tag that can be fixed with a simple command looks like:

    git tag -f I200060718-0800 92449d402cf8102c457dd97041200694bfeefc07^1

A manufactured commit that includes source from another location looks like:

    Cherrypick tag: v20110614-1530 commit: 434167c0a45e445c4febc9f5c21246a2caeeccf1

If it's important to fix this tag, a developer would have to track down the "cherrypick" parents and try to resolve the commit manually:

    git log -1 --format="format:remove:%H:%an:%b" 434167c0a45e445c4febc9f5c21246a2caeeccf1 | grep Cherrypick
    Cherrypick from master 2010-06-03 14:58:28 UTC Boris Bokowski <bbokowski@eclipse.org>
    Cherrypick from R4_HEAD 2011-06-14 17:45:22 UTC Paul Webster <pwebster@eclipse.org>
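With a few hundred tags, a quick tally of the three line types helps decide how much manual work is left. The sketch below is not part of the attachment; the function name summarize_report is hypothetical, and it only assumes the three line shapes quoted above.

```shell
#!/bin/bash
# summarize_report: read report lines on stdin, print a count per line type
# and the tags that need manual attention. Hypothetical helper, not the
# attached script.
summarize_report() {
  awk '
    /^Tag clean:/      { clean++ }
    /^git tag -f /     { fixable++ }
    /^Cherrypick tag:/ { cherry++; manual = manual " " $3 }   # $3 is the tag name
    END {
      printf "clean=%d fixable=%d cherrypick=%d\n", clean, fixable, cherry
      if (cherry) printf "manual:%s\n", manual
    }
  '
}

# Sample run on the three example lines from the report.
summarize_report <<'EOF'
Tag clean: Bug182059_before_HEAD_rebase commit: 4e9b17ddda7d26513ae89a265af4d131b3c98e8c
git tag -f I200060718-0800 92449d402cf8102c457dd97041200694bfeefc07^1
Cherrypick tag: v20110614-1530 commit: 434167c0a45e445c4febc9f5c21246a2caeeccf1
EOF
```

The "git tag -f" lines can then be pulled out with a plain grep and reviewed before running them.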
Created attachment 198532
Skip branch commits with parent filter script

This will skip the delete commits at the beginning of most branches. But it doesn't do what we want: the commit goes away, but the delete changes are just pulled up into the child commit.

PW
(In reply to comment #6)
> As for changing the tool, here's a patch we can try to see if that gets rid of
> the delete commits (adapted from

This was a bad idea, it trashed the repo.

PW
One way to make the secondary tags like R3_6_1 and R3_6_2 is to generate a symbol-info.txt file and then modify it to be a symbol-hints.txt file:

    -0 R3_6_2 tag . .trunk.
    -0 R3_6_1 tag . .trunk.
    +0 R3_6_2 tag . R3_6_maintenance
    +0 R3_6_1 tag . R3_6_maintenance
     0 R3_6 tag . .trunk.

Then those tags are generated as manufactured commits off of something appropriate in the branch, as opposed to franken-built off of master.

PW
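Doing that edit by hand for every point release gets tedious, so it could be scripted. This is a rough sketch only: retarget_tags is a hypothetical helper, and it assumes the five-column layout shown above (project id, symbol name, conversion type, path, parent), with the parent in the last column.

```shell
#!/bin/bash
# retarget_tags PARENT TAG...
# Reads symbol-info lines on stdin and writes hint lines on stdout, with the
# parent column of the named tags replaced by PARENT. Hypothetical helper.
retarget_tags() {
  local parent=$1
  shift
  awk -v parent="$parent" -v tags="$*" '
    BEGIN { n = split(tags, t, " "); for (i = 1; i <= n; i++) want[t[i]] = 1 }
    # Only touch lines that are tags and whose name was requested.
    $3 == "tag" && ($2 in want) { $5 = parent }
    { print }
  '
}

# Sample run on the three lines from the example.
retarget_tags R3_6_maintenance R3_6_1 R3_6_2 <<'EOF'
0 R3_6_2 tag . .trunk.
0 R3_6_1 tag . .trunk.
0 R3_6 tag . .trunk.
EOF
```

In practice the input would be the generated symbol-info.txt and the output would be saved as symbol-hints.txt; lines for symbols that are not listed pass through unchanged.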
From what you've said cvs2git is doing the right thing. The stuff that wasn't tagged really shouldn't be part of the git tag. Even if you remove the 'delete', what version of the missing files do you choose? At the time of the branch, clearly the version on HEAD at that time is appropriate. However as soon as HEAD moves on, won't history diverge? I.e. When doing the maintenance branch build, what version of the non-tagged stuff do you use?
(In reply to comment #13)
> Even if you remove the 'delete', what version of the missing files do you
> choose? At the time of the branch, clearly the version on HEAD at that time is
> appropriate. However as soon as HEAD moves on, won't history diverge? I.e. When
> doing the maintenance branch build, what version of the non-tagged stuff do you
> use?

What our git repo *should* look like:

A--R3_6--B--C--master
      \
       ----D--E--R3_6_1--F--G--R3_6_2--H--I--R3_6_maintenance

Because we tagged Platform UI R3_6 but then only branched 12 of 42 projects, the D commit, which I don't like, deletes 30 projects (but they're supposed to be visible, as of the R3_6 commit, while on R3_6_maintenance). The R3_6 branch point is correct.

I'm testing a CVS pre-conditioning. If I make sure that all the projects are tagged R3_6 and all of the projects are branched, cvs2git produces a much better history for the tags and branches I care about (without D as a delete commit).

The other thing I need to try is to convert the e4 graduated repo by itself, and then stitch it into the Platform UI repo (probably from a different branch).

PW
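For the branching half of that pre-conditioning, the idea can be sketched as a dry run that prints one "cvs rtag -b" command for every project that does not have the branch yet. This is an illustration only, not the procedure actually used: print_precondition_cmds and the two list files are hypothetical, the module paths in a real CVS repository will differ, and the commands should only ever be run against a copy of the repository.

```shell
#!/bin/bash
# print_precondition_cmds TAG BRANCH ALL_FILE BRANCHED_FILE
# ALL_FILE lists every project, BRANCHED_FILE lists the projects that already
# have BRANCH (one name per line). Prints the cvs commands; runs nothing.
print_precondition_cmds() {
  local tag=$1 branch=$2 all=$3 branched=$4
  awk -v tag="$tag" -v branch="$branch" '
    # First file: remember which projects are already branched.
    FILENAME == ARGV[1] { have[$1] = 1; next }
    # Second file: emit a branch command for every project not seen above.
    NF && !($1 in have) { printf "cvs rtag -b -r %s %s %s\n", tag, branch, $1 }
  ' "$branched" "$all"
}

# Sample run with two project names mentioned earlier in this bug.
work=$(mktemp -d)
printf '%s\n' org.eclipse.ui.workbench org.eclipse.core.commands > "$work/all.txt"
printf '%s\n' org.eclipse.ui.workbench > "$work/branched.txt"
print_precondition_cmds R3_6 R3_6_maintenance "$work/all.txt" "$work/branched.txt"
```

Projects that were never tagged R3_6 in the first place would need the tag applied before they can be branched from it, which this sketch does not attempt.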
I've pushed my steps and my change scripts to a public repo: git://git.eclipse.org/gitroot/e4/org.eclipse.migration.git I wrote down what I did this time in scripts/cvs2git_prep.txt. Most of my generated files went into eclipse.platform.ui/pass2. I've pushed the temporary Platform UI repo to git://git.eclipse.org/gitroot/platform/eclipse.platform.ui.git ... it looks much better, except I should re-run my id scanner to pick up Chris G :-) PW
OK, pre-conditioning the repo was the best option to get decent looking branches for 3.1-3.6. My scripts and steps are published in git://git.eclipse.org/gitroot/e4/org.eclipse.migration.git PW
Paul, are you doing the pre-conditioning in the CVS master repository? If so, do I need to do anything to precondition the three UI projects (two if you don't include org.eclipse.ui.forms.examples, which does not go into the build and does not have branches)?
(In reply to comment #17)
> Paul, are you doing the pre-conditioning in the CVS master repository? If so do
> I need to do do anything to precondition the three UI projects ( two if you
> don't include org.eclipse.ui.forms.examples which does not go into the build
> and does not have branches ).

Check out git://git.eclipse.org/gitroot/platform/eclipse.platform.ui.git ... I'm pretty sure I've got all 3 projects in there already. I made them match when I did our projects, like org.eclipse.jface.snippets.

PW