Bug 353779

Summary: Move TCF to git
Product: [Tools] TCF
Reporter: Doug Schaefer <cdtdoug>
Component: Core
Assignee: Doug Schaefer <cdtdoug>
Status: RESOLVED FIXED
QA Contact: Eugene Tarassov <eugene>
Severity: normal
Priority: P3
CC: cdtdoug, felix.burton, jamesblackburn+eclipse, mober.at+eclipse, uwe.st
Version: unspecified   
Target Milestone: 0.5.0   
Hardware: PC   
OS: Windows 7   
Whiteboard:
Bug Depends on: 359239    
Bug Blocks:    

Description Doug Schaefer CLA 2011-08-03 10:45:27 EDT
This bug tracks the discussion and actions we need to perform to get TCF to git. Here are the things we need to do.

- Close on the conversion strategy and in particular, which tool do we use.

- Do we want to break up the TCF repo into multiple ones? The agent is a good example, since I know a number of projects that would like to import it as a submodule.

- Once converted, we need to get Hudson builds going for the Java bits that are used by CDT's EDC.

Any more?
Comment 1 Doug Schaefer CLA 2011-08-03 10:47:05 EDT
BTW, I'd like to get this finished by the end of next week, Aug 12.
Comment 2 Uwe Stieber CLA 2011-08-03 10:56:02 EDT
> - Close on the conversion strategy and in particular, which tool do we use.

What do you mean by "which tool do we use"? Can you give some background?

>- Once converted, we need to get Hudson builds going for the Java bits that are
>used by CDT's EDC.

Assuming that we will have Hudson builds for all the Java bits, no? At least both the TCF Framework bits and the Target Explorer bits should be built by Hudson. Don't we go and execute the top pom.xml from our repository root?
Comment 3 Doug Schaefer CLA 2011-08-03 11:17:23 EDT
(In reply to comment #2)
> > - Close on the conversion strategy and in particular, which tool do we use.
> 
> What do you mean by "which tool do we use"? Can you give some background?

CDT used a tool called cvs2git. I believe there is a svn2git as well. Or we could just use git-svn.

> >- Once converted, we need to get Hudson builds going for the Java bits that are
> >used by CDT's EDC.
> 
> Assuming that we will have Hudson builds for all the Java bits, no? At least
> both the TCF Framework bits and the Target Explorer bits should to be build by
> Hudson. Don't we go and execute the top pom.xml from our repository root?

Yes, our current maven set up should work, I just need to get it into a Hudson job and co-ordinate it with the CDT builds, i.e. cdt, then tcf, then cdt-edc.
Comment 4 Felix Burton CLA 2011-08-04 20:05:57 EDT
(In reply to comment #3)
> (In reply to comment #2)
> > > - Close on the conversion strategy and in particular, which tool do we use.
> > 
> > What do you mean by "which tool do we use"? Can you give some background?
> 
> CDT used a tool called cvs2git. I believe there is a svn2git as well. Or we
> could just use git-svn.

I have been using git-svn with good success. In fact, I tried it on TCF earlier today and it worked just fine using the following command:

  git svn clone -Ttrunk -ttags -bbranches svn://dev.eclipse.org/svnroot/dsdp/org.eclipse.tm.tcf

After that I ran the following commands to make the branches visible to cloners of the git repository:

  git checkout -t -b 0.4.0 remotes/0.4.0
  git checkout -t -b 0.3.0 remotes/tags/0.3.0
  git checkout -t -b initial remotes/tags/initial
  git checkout -t -b CDT_8_0 remotes/tags/CDT_8_0

And finally I would run the following command to get ignore information from svn

  git svn create-ignore

Note that it does not check-in the .gitignore files, so you have to do that manually.  Also note that it overwrites any existing .gitignore files, so you should probably merge them.
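
The merge step mentioned above could be sketched roughly as follows. This is only an illustration, not part of the original procedure: the helper name merge_ignore and the file names in the usage note are hypothetical, and it assumes the generated patterns have first been written to a separate file (for example with "git svn show-ignore") rather than over the existing .gitignore.

```shell
#!/bin/sh
# Hypothetical helper: merge freshly generated ignore patterns into an
# existing .gitignore instead of overwriting it.
# Usage: merge_ignore EXISTING GENERATED  (merged result goes to stdout)
merge_ignore() {
    existing=$1
    generated=$2
    # Concatenate both files and keep only the first occurrence of each
    # line, preserving the original order. A missing file is skipped.
    cat "$existing" "$generated" 2>/dev/null | awk '!seen[$0]++'
}
```

Possible usage, with placeholder file names: git svn show-ignore > generated.txt, then merge_ignore .gitignore generated.txt > .gitignore.new, review the result, and move it into place before committing.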
Comment 5 Felix Burton CLA 2011-08-04 20:16:13 EDT
With SVN it was possible to check out parts of the repository, for example to only get the agent code, but with git that will not be possible anymore. I am not sure how big of an issue this is, but if we want to do a split then now is the time.

I see the following natural boundaries:

1. C Implementation
2. Java implementation
3. python implementation
4. docs
5. target explorer

There is some overhead with having multiple repositories, so we might want to be conservative and only split the big stuff, i.e. 1, 2 and 5. 3 and 4 could be part of 2.

Thoughts?
Felix
Comment 6 Uwe Stieber CLA 2011-08-05 02:15:03 EDT
(In reply to comment #5)

+1 for 1, [2-4], 5. I mean we could leave 5 with [2-4], but I agree that the TE is somewhat independent from [2-4].

(In reply to comment #4)

+1 for "git svn".

Doug, can you send out an "SVN Repo Closed" message before you run the import for real? We have ongoing commits to the SVN repository, and to avoid losing anything while switching to Git, we should know when no more commits should go to SVN.
Comment 7 Doug Schaefer CLA 2011-08-05 10:42:26 EDT
From my experience it's always easiest to have the fewest git repos. It's just easier to manage branches and builds.

The only requirement I've seen is for the C agent to be separated out. And even there I'm not very happy about that. The agent should have a plug-in architecture and be able to be built independently of the client software that incorporates it, you know, like a library. But it is what it is for now.

We can always split the repo later if we get blocked down the road.

I will send a notice out Monday, and we'll do the conversion Friday.
Comment 8 Doug Schaefer CLA 2011-08-05 11:57:14 EDT
(In reply to comment #4)

Thanks Felix! These commands will be very helpful.
Comment 9 James Blackburn CLA 2011-09-20 10:35:11 EDT
(In reply to comment #4)
>   git svn clone -Ttrunk -ttags -bbranches
> svn://dev.eclipse.org/svnroot/dsdp/org.eclipse.tm.tcf

I track TCF using the above, which I believe can be simplified to:

git svn clone -s svn://dev.eclipse.org/svnroot/dsdp/org.eclipse.tm.tcf

It seems that git-svn makes tags into branches, so you'll need to manually reapply the tag.
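
Reapplying the tags could look something like the sketch below. It assumes the SVN tags ended up as remote refs under refs/remotes/tags, which matches the remotes/tags/... names used earlier in this bug but may differ in other git-svn versions or layouts; check "git branch -r" first. The function name is made up for illustration.

```shell
#!/bin/sh
# Sketch: turn the remote branches that git-svn creates for SVN tags
# into real git tags. The default ref prefix is an assumption; pass a
# different one as the first argument if your clone uses another layout.
svn_tags_to_git_tags() {
    prefix=${1:-refs/remotes/tags}
    git for-each-ref --format='%(refname)' "$prefix" |
    while read -r ref; do
        # Strip the prefix to get the plain tag name, e.g. 0.3.0.
        name=${ref#"$prefix"/}
        # Create a lightweight tag pointing at the same commit.
        git tag "$name" "$ref"
    done
}
```

Run it from inside the converted clone. It creates lightweight tags only and leaves the original remote refs in place, so they can still be inspected or deleted afterwards.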


IMO Doug's right, fewer git repos are currently easier (egit, for one, doesn't support git submodules...).  You can easily split out parts of the repository at a later date, however stitching repositories together is hard...

Having imported tcf and repacked, you end up with a reasonably sized repo:
  git repack -a -d --depth=250 --window=250
  git gc --aggressive
  du -sh .git
    13M

I guess you're doing this as we speak, as I notice the test tcf repository has gone away ;)
Comment 10 Doug Schaefer CLA 2011-09-20 10:43:35 EDT
I'm following this:

http://wiki.eclipse.org/Git/Migrating_to_Git#Using_svn2git_on_a_remote_server

One thing I'm planning on doing is splitting out the C agent directory into its own repo. A typical scenario I've seen is a natural case to use it as a submodule. This will require some filter-branch magic.

And yes, I'm playing with different scenarios right now. I'll send a note to tcf-dev when I have something you can play with.
Comment 11 James Blackburn CLA 2011-09-20 10:50:09 EDT
(In reply to comment #10)
> A typical scenario I've seen is a natural case to use it as a
> submodule.

Except egit doesn't yet do submodules ;)


I guess from my POV if the two can be built independently then they're naturally separate modules. If I need to have both (and subsequently I need to create mirrors to track two repositories here) then it's less clear that there's value in having a separate repository...
Comment 12 Doug Schaefer CLA 2011-09-20 10:56:38 EDT
In the scenarios I'm talking about, people aren't really using Eclipse either. Using the agent as a submodule should be purely optional.

And submodule support for egit is gathering momentum.
Comment 13 Doug Schaefer CLA 2011-09-20 11:22:35 EDT
Interesting anecdote: Turns out the C agent was about half of the repo. Split right down the middle at 12MB.
Comment 14 James Blackburn CLA 2011-09-20 11:33:45 EDT
(In reply to comment #13)
> Interesting anecdote: Turns out the C agent was about half of the repo. Split
> right down the middle at 12MB.

Is that 12MB each, or in total? I got 13MB all-in when I just did the import using git-svn.
Comment 15 Doug Schaefer CLA 2011-09-20 11:37:18 EDT
(In reply to comment #14)
> (In reply to comment #13)
> > Interesting anecdote: Turns out the C agent was about half of the repo. Split
> > right down the middle at 12MB.
> 
> Is that 12MB each, or in total? I got 13MB all-in when I just did the import
> using git-svn.

You may be missing stuff, which is why svn2git exists apparently.
Comment 16 Doug Schaefer CLA 2011-09-20 11:37:26 EDT
Any suggestions on what we should do with the server directory? It seems more related to the agent. I could create a third repo for it.

I just want to make sure we don't use submodules in the main repo until egit supports them. Right now the server makefile references the agent at ../agent. Splitting it out would make it easier to set this up.
Comment 17 Eugene Tarassov CLA 2011-09-20 11:52:51 EDT
(In reply to comment #16)
> Any suggestions on what we should do with the server directory? It seems more
> related to the agent. I could create a third repo for it.
> 
> I just want to make sure we don't use submodules in the main repo until egit
> supports them. Right now the server makefile references the agent at ../agent.
> Splitting it out would make it easier to set this up.

The server directory should go into the agent repository. The server is built from the same source as the agent. Essentially, it is just a role the agent can play; it is not really a separate code base.
Comment 18 Doug Schaefer CLA 2011-09-20 12:00:12 EDT
(In reply to comment #14)
> (In reply to comment #13)
> > Interesting anecdote: Turns out the C agent was about half of the repo. Split
> > right down the middle at 12MB.
> 
> Is that 12MB each, or in total? I got 13MB all-in when I just did the import
> using git-svn.

Never mind. Appears just doing a gc brings it down to 12+MB :p
Comment 19 Doug Schaefer CLA 2011-09-20 12:03:45 EDT
(In reply to comment #17)
> (In reply to comment #16)
> > Any suggestions on what we should do with the server directory? It seems more
> > related to the agent. I could create a third repo for it.
> > 
> > I just want to make sure we don't use submodules in the main repo until egit
> > supports them. Right now the server makefile references the agent at ../agent.
> > Splitting it out would make it easier to set this up.
> 
> Server directory should go into the agent repository. The server is built from
> same source as the agent. Essentially, it is just a role the agent can play, it
> is not really a separate code base.

Thanks Eugene. I'm attempting to move the server directory into the agent directory using git filter-branch since that becomes the top directory in the new agent repo.
Comment 20 Doug Schaefer CLA 2011-09-20 12:21:47 EDT
OK, test repos are ready. Please let me know what you think.

http://git.eclipse.org/c/tcf/test/org.eclipse.tcf.git/
http://git.eclipse.org/c/tcf/test/org.eclipse.tcf.agent.git/

The server directory moved nicely into agent and history is all good. Love git filter-branch!
Comment 21 Eugene Tarassov CLA 2011-09-21 13:03:52 EDT
(In reply to comment #20)
> OK, test repos are ready. Please let me know what you think.
> 
> http://git.eclipse.org/c/tcf/test/org.eclipse.tcf.git/
> http://git.eclipse.org/c/tcf/test/org.eclipse.tcf.agent.git/
> 
> The server directory moved nicely into agent and history is all good. Love git
> filter-branch!

It looks like tags were not imported properly. git clone reports errors:

error: refs/tags/CDT_8_0 does not point to a valid object!
error: refs/tags/0.3.0 does not point to a valid object!

Also, .gitignore appears missing/broken in both repos.
"git svn create-ignore" needs to be done during the conversion.
Comment 22 Doug Schaefer CLA 2011-09-21 13:23:33 EDT
I'll take a look. Interesting about the tags. Egit was able to clone it including the tags without a problem.
Comment 23 James Blackburn CLA 2011-09-21 13:46:54 EDT
(In reply to comment #21)
> Also, .gitignore appears missing/broken in both repos.
> "git svn create-ignore" needs to be done during the convertion.

Doug didn't migrate using git-svn.  I'm guessing you'll need to add the gitignore manually.
Comment 24 Eugene Tarassov CLA 2011-09-21 14:02:57 EDT
(In reply to comment #23)
> (In reply to comment #21)
> > Also, .gitignore appears missing/broken in both repos.
> > "git svn create-ignore" needs to be done during the convertion.
> 
> Doug didn't migrate using git-svn.  I'm guessing you'll need to add the
> gitignore manually.

How was it done?
Doug, could you post a script?
Comment 25 Doug Schaefer CLA 2011-09-21 14:04:32 EDT
Figured out the tags thing. It's leftovers from the filter-branch. This article saved the day, especially the last comment about removing lines from info/refs and packed-refs.

    http://stackoverflow.com/questions/1216733/remove-a-directory-permanently-from-git
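
For comparison, here is a sketch of the more conventional cleanup after a filter-branch rewrite. It is not the fix described above (hand-editing info/refs and packed-refs), and it may or may not be enough to clear the "does not point to a valid object" errors on its own. The function name is hypothetical; the git commands are the standard ones for dropping filter-branch backups.

```shell
#!/bin/sh
# Sketch: drop the backup refs that git filter-branch leaves under
# refs/original/, then expire reflogs and prune unreachable objects.
# Run inside the rewritten repository.
cleanup_after_filter_branch() {
    git for-each-ref --format='%(refname)' refs/original/ |
    while read -r ref; do
        git update-ref -d "$ref"
    done
    git reflog expire --expire=now --all
    git gc --quiet --prune=now
}
```

Running "git fsck" afterwards is a reasonable way to check whether any remaining refs still point at missing objects.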
Comment 26 Doug Schaefer CLA 2011-09-21 14:06:38 EDT
(In reply to comment #24)
> (In reply to comment #23)
> > (In reply to comment #21)
> > > Also, .gitignore appears missing/broken in both repos.
> > > "git svn create-ignore" needs to be done during the convertion.
> > 
> > Doug didn't migrate using git-svn.  I'm guessing you'll need to add the
> > gitignore manually.
> 
> How was it done?
> Doug, could you post a script?

I'm following this:

http://wiki.eclipse.org/Git/Migrating_to_Git#Using_svn2git_on_build.eclipse.org

svn2git doesn't seem to have a create-ignore feature. And Felix's remarks about having to merge with pre-existing .gitignore files have me worried about doing anything automatically. It's probably worth understanding where the .gitignores need to be and manually updating them.
Comment 27 Doug Schaefer CLA 2011-09-21 15:05:46 EDT
Another reason not to rely on create-ignore: it creates too many of them. It looks like the Subversion Eclipse plug-in creates one for every project, each containing the same thing. gitignore is more powerful that way.
Comment 28 Eugene Tarassov CLA 2011-09-21 15:13:35 EDT
(In reply to comment #27)
> Another reason not to rely on create-ignore, it creates too many of them. Looks
> like the subversion eclipse plug-ins creates one for every project containing
> the same thing. gitignore is more powerful that way.

I have updated/committed .gitignore files in the svn repo. There are two of them - in top level and agent directories. I've used "git svn show-ignore" for that.
Comment 29 Doug Schaefer CLA 2011-09-28 15:52:16 EDT
For the record, here are the filter-branch commands I'm running on the repos to move server under agent and agent into its own repo.

Run svn2git to produce git.svn repo. Run official initrepo command to create org.eclipse.tcf.git. git push --mirror from the git.svn repo to the final git one.

Move server into agent:
git filter-branch --tree-filter "mv server agent || true" --tag-name-filter cat --prune-empty -- --all

Create agent repo:
cp -R org.eclipse.tcf.git org.eclipse.tcf.agent.git
cd org.eclipse.tcf.agent.git
git filter-branch --subdirectory-filter agent --tag-name-filter cat --prune-empty -- --all

Back in org.eclipse.tcf.git, remove agent:
git filter-branch --tree-filter "rm -fr server agent" --tag-name-filter cat --prune-empty -- --all

Ran diffs in resulting repo against svn and nothing missing on the branches and tags.
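
A check along those lines might be sketched as below. It assumes a git working tree and an SVN checkout or export of the same branch or tag sit side by side on disk; both paths are placeholders, and the function name is made up. The -x option is available in GNU and BSD diff but is not guaranteed everywhere.

```shell
#!/bin/sh
# Sketch: compare a git working tree with an SVN checkout/export of the
# same branch or tag, skipping each tool's metadata directories.
# Prints the differing or missing paths; exits 0 when the trees match.
compare_trees() {
    diff -r -q -x .git -x .svn "$1" "$2"
}
```

For example, compare_trees org.eclipse.tcf svn-trunk would be run once per branch and tag after checking each one out on both sides. Directories that are empty on the SVN side show up as "Only in" entries.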
Comment 30 Doug Schaefer CLA 2011-09-28 19:29:33 EDT
(In reply to comment #29)
> Ran diffs in resulting repo against svn and nothing missing on the branches and
> tags.

Actually, I'll revise that. There were three empty directories that didn't make it over. You can't check empty directories into git.
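
If the empty directories matter, one common workaround is a placeholder file in each. The sketch below is an assumption-laden illustration, not something done in this conversion: the function names are made up, .gitkeep is only a naming convention with no special meaning to git, and -empty is a GNU/BSD find extension. It is meant to be run on a clean tree such as an "svn export", since older SVN working copies keep a .svn directory in every folder.

```shell
#!/bin/sh
# Sketch: list empty directories under a tree, skipping VCS metadata.
list_empty_dirs() {
    find "$1" \( -name .git -o -name .svn \) -prune -o -type d -empty -print
}

# Drop an empty placeholder file into each empty directory so that git
# has something to track there.
keep_empty_dirs() {
    list_empty_dirs "$1" | while read -r dir; do
        : > "$dir/.gitkeep"
    done
}
```

Running list_empty_dirs first shows which directories would be affected before anything is written.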
Comment 31 Doug Schaefer CLA 2011-09-29 11:48:32 EDT
Conversion complete. Repos can be viewed on-line here:

   http://git.eclipse.org/c/tcf
Comment 32 Martin Oberhuber CLA 2011-09-30 09:06:46 EDT
Are the git:// and ssh:// URLs for the new repo already documented somewhere, like on the CDT contributor's guide?
   http://wiki.eclipse.org/Getting_started_with_CDT_development

I think that these should go into a TCF contributor's guide, no? - By trial and error I figured out
   
   git://git.eclipse.org/gitroot/tcf/org.eclipse.tcf.agent.git

but I couldn't get ssh to work ("Auth fail"):

   ssh://moberhuber@git.eclipse.org/gitroot/tcf/org.eclipse.tcf.agent.git