Bug 316208 - CDT git migration
Summary: CDT git migration
Status: CLOSED FIXED
Alias: None
Product: CDT
Classification: Tools
Component: cdt-releng
Version: 7.0
Hardware: PC All
Importance: P3 normal
Target Milestone: ---
Assignee: cdt-releng-inbox@eclipse.org CLA
QA Contact: Doug Schaefer CLA
URL:
Whiteboard:
Keywords:
Depends on: 303404 316211 316212 348474
Blocks: 345659
Reported: 2010-06-08 16:53 EDT by James Blackburn CLA
Modified: 2011-06-24 20:26 EDT (History)
12 users (show)

See Also:


Attachments
cvs2git.options (30.11 KB, application/octet-stream)
2011-05-10 04:06 EDT, James Blackburn CLA
cvs2git.options (31.04 KB, text/plain)
2011-05-10 15:47 EDT, James Blackburn CLA
recipe.txt (2.18 KB, text/plain)
2011-05-10 15:53 EDT, James Blackburn CLA
list of .projects in the converted org.eclipse.cdt (5.98 KB, text/plain)
2011-05-10 16:38 EDT, James Blackburn CLA
largest-files.txt (4.40 KB, text/plain)
2011-05-10 16:52 EDT, James Blackburn CLA
recipe.txt (2.29 KB, text/plain)
2011-05-10 18:12 EDT, James Blackburn CLA
recipe.txt (2.96 KB, text/plain)
2011-05-11 04:00 EDT, James Blackburn CLA
Tags (1.48 KB, text/plain)
2011-05-11 04:02 EDT, James Blackburn CLA
v-tags-to-preserve.txt (2.02 KB, text/plain)
2011-05-11 12:38 EDT, James Blackburn CLA
recipe.txt (3.74 KB, text/plain)
2011-05-11 12:40 EDT, James Blackburn CLA
v-tags-to-preserve.txt (2.12 KB, text/plain)
2011-06-09 06:24 EDT, James Blackburn CLA
recipe.txt (4.24 KB, text/plain)
2011-06-09 06:25 EDT, James Blackburn CLA
recipe.txt (5.97 KB, text/plain)
2011-06-15 16:02 EDT, James Blackburn CLA
recipe.txt (6.13 KB, text/plain)
2011-06-24 12:13 EDT, James Blackburn CLA
v-tags-to-preserve.txt (2.14 KB, text/plain)
2011-06-24 12:15 EDT, James Blackburn CLA

Description James Blackburn CLA 2010-06-08 16:53:25 EDT
We've been having a discussion on the cdt-dev mailing list about moving the CDT central repositories to git.

The upshots of moving to a DVCS are many.  It should make it easier for downstream users to clone the code, make changes, track upstream, and submit patches that are easy to regenerate when upstream head moves on.  It'll certainly make the lives of downstream products based on CDT forks easier...

This is a central bug to catalog the issues that need to be resolved before we can get agreement to switch.
Comment 1 Doug Schaefer CLA 2011-05-09 22:58:35 EDT
James, I tried your repo on github. Is it possible to filter out all directories with /old/ in them? These are things we don't care about in recent history if ever.

Also, is there any way to prune the history? Say for example we declare that the CDT started with the CDT 5.0 drop and remove all versions and branches before that. If anyone wants history before that we'll have the CVS repos available in archive.
Comment 2 Doug Schaefer CLA 2011-05-09 23:09:19 EDT
contrib and util would be other candidates for not moving over.
Comment 3 Doug Schaefer CLA 2011-05-09 23:10:34 EDT
and cppunit :).
Comment 4 Andrew Gvozdev CLA 2011-05-09 23:47:33 EDT
(In reply to comment #1)
> Also, is there any way to prune the history? Say for example we declare that the
> CDT started with the CDT 5.0 drop and remove all versions and branches before
> that. If anyone wants history before that we'll have the CVS repos available in
> archive.
I don't get why we want to prune the history, often it is invaluable. But if we do, maybe we can rather have 2 Git repositories, one with full history and another one pruned? I would *much* prefer to work with the full history in one repository.
Comment 5 James Blackburn CLA 2011-05-10 04:06:39 EDT
Created attachment 195181 [details]
cvs2git.options

Recipe for conversion.

Options used for the conversion attached.

Steps to reproduce:
1) Grab a full copy of the tools repository:
   wget http://archive.eclipse.org/arch/tools-cvs.tgz

Create a directory to work in.

2) Unpack CDT bits

tar -zxvf .../tools-cvs.tgz cvs/tools/org.eclipse.cdt cvs/tools/org.eclipse.cdt-build cvs/tools/org.eclipse.cdt-contrib cvs/tools/org.eclipse.cdt-core cvs/tools/org.eclipse.cdt-cppunit cvs/tools/org.eclipse.cdt-debug cvs/tools/org.eclipse.cdt-doc cvs/tools/org.eclipse.cdt-launch cvs/tools/org.eclipse.cdt-old cvs/tools/org.eclipse.cdt-releng cvs/tools/CVSROOT

3) Move the externally stored CDT components into place:
pushd cvs/tools/
mv org.eclipse.cdt-build/ org.eclipse.cdt/build
mv org.eclipse.cdt-contrib/ org.eclipse.cdt/contrib
mv org.eclipse.cdt-core/ org.eclipse.cdt/core
mv org.eclipse.cdt-cppunit/ org.eclipse.cdt/cppunit
mv org.eclipse.cdt-debug/ org.eclipse.cdt/debug
mv org.eclipse.cdt-doc/ org.eclipse.cdt/doc
mv org.eclipse.cdt-launch/ org.eclipse.cdt/launch
mv org.eclipse.cdt-old/ org.eclipse.cdt/old
mv org.eclipse.cdt-releng/* org.eclipse.cdt/releng/
popd


4) Remove broken symlinks in the repo (from the 'all' directory to the CDT components we moved above)
   find cvs/ -type l|xargs -n 1 rm

5) Run cvs2git to do the conversion:
   http://cvs2svn.tigris.org/cvs2git.html

5.1) Get latest cvs2git:
    svn co --username=guest --password="" http://cvs2svn.tigris.org/svn/cvs2svn/trunk cvs2svn-trunk

5.2) .../cvs2svn/cvs2git --options=cvs2git.options
    (options file attached)

6) Import the cvs2git output into git

  mkdir org.eclipse.cdt
  cd org.eclipse.cdt
  git init
  cat ../cvs2svn-tmp/git-blob.dat ../cvs2svn-tmp/git-dump.dat | git fast-import

7) Move tags into place
  python ..../cvs2svn/contrib/git-move-refs.py

A full conversion, including ...

cvs2svn Statistics:
------------------
Total CVS Files:             24753
Total CVS Revisions:        150689
Total CVS Branches:         124784
Total CVS Tags:            9305400
Total Unique Tags:            1164
Total Unique Branches:          27
CVS Repos Size in KB:       703716
Total SVN Commits:           27098
First Revision Date:    Mon Sep 10 23:15:52 2001
Last Revision Date:     Sat Apr 30 19:20:53 2011

The size of the git repository is 250M. The size of the checkout 257M (total size: 507M).
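
Note on step 4: `find cvs/ -type l | xargs -n 1 rm` deletes every symlink under cvs/, whether or not its target still exists. If only the dangling links should go, a minimal Python sketch could look like the following. The function name `remove_dangling_symlinks` is hypothetical and is not part of the attached recipe.

```python
import os


def remove_dangling_symlinks(root):
    """Delete symlinks under root whose targets no longer exist.

    Returns the sorted list of removed paths.
    """
    removed = []
    # os.walk does not descend into symlinked directories by default, so
    # links may show up in either dirnames or filenames; check both.
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            # islink() inspects the link itself; exists() follows it and
            # returns False when the target is missing.
            if os.path.islink(path) and not os.path.exists(path):
                os.remove(path)
                removed.append(path)
    return sorted(removed)
```

Calling `remove_dangling_symlinks("cvs/")` from the working directory would leave valid links in place, which differs from the recipe's behaviour.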
Comment 6 James Blackburn CLA 2011-05-10 04:21:26 EDT
(In reply to comment #1)
> Is it possible to filter out all
> directories with /old/ in them? These are things we don't care about in recent
> history if ever.
> 
> Also, is there any way to prune the history?

We can do both of these if wanted. I thought I would just do the whole lot first to see what the upper bound is...

Obvious things to prune are anything not included in the most recent releng tag:
 - build/old
 - contrib/
 - core/old
 - debug/old
 - old/
 - releng/old
 - cppunit/

If you think of anything else please add...

(In reply to comment #4)
> I don't get why we want to prune the history, often it is invaluable. But if we
> do, maybe we can rather have 2 Git repositories, one with full history and
> another one pruned? I would *much* prefer to work with the full history in one
> repository.

Well there's stuff that isn't built, tested, or released. I agree we should be careful about preserving mainline history. But experimental stuff that was dropped can likely just go.


Authors:

Using bugzilla I reconstructed author names and emails. The list looks like this for (recent-ish) committers.

Author: Alain Magloire <alain@qnx.com>
Author: Alena Laskavaia <elaskavaia.cdt@gmail.com>
Author: Andrew Ferguson <andrew.ferguson@symbian.com>
Author: Andrew Gvozdev <angvoz.dev@gmail.com>
Author: Andrew Niefer <aniefer@ca.ibm.com>
Author: Anton Leherbauer <anton.leherbauer@windriver.com>
Author: Bogdan Gheorghe <gheorghe@ca.ibm.com>
Author: Chris Recoskie <recoskie@ca.ibm.com>
Author: Chris Wiebe <cwiebe@ftml.net>
Author: David Daoust <dave.daoust@windriver.com>
Author: David Dubrow <david.dubrow@nokia.com>
Author: David Inglis <dinglis@qnx.com>
Author: Doug Schaefer <doug.schaefer@windriver.com>
Author: Ed Swartz <ed.swartz@nokia.com>
Author: Emanuel Graf <egraf@hsr.ch>
Author: Francois Chouinard <fchouinard@gmail.com>
Author: Hoda Amer <hamer@ca.ibm.com>
Author: James Blackburn <jamesblackburn+eclipse@gmail.com>
Author: Jason Montojo <jason.montojo@gmail.com>
Author: John Camelon <jcamelon@ca.ibm.com>
Author: John Cortell <john.cortell@freescale.com>
Author: Judy N. Green <jgreen@qnx.com>
Author: Ken Ryall <ken.ryall@nokia.com>
Author: L. Frank Turovich <frank.turovich@nokia.com>
Author: Leo Treggiari <leo.treggiari@intel.com>
Author: Ling Wang <ling.5.wang@nokia.com>
Author: Marc Khouzam <marc.khouzam@ericsson.com>
Author: Marc-Andre Laperle <malaperle@omnialabs.net>
Author: Markus Schorn <markus.schorn@windriver.com>
Author: Martin Lescuyer <mlescuyer@rational.com>
Author: Mike Kucera <mkucera@ca.ibm.com>
Author: Mikhail Khodjaiants <mikhailkhod@googlemail.com>
Author: Mikhail Sennikovsky <mikhail.sennikovskiy@gmail.com>
Author: Norbert Plött <norbert.ploett@siemens.com>
Author: Oleg Krasilnikov <oleg.krasilnikov@intel.com>
Author: Patrick Chuong <pchuong@ti.com>
Author: Pawel Piech <pawel.piech@windriver.com>
Author: Peter Graves <pgraves@qnx.com>
Author: Randy Rohrbach <Randy.Rohrbach@Windriver.com>
Author: Sean Evoy <sevoy@ca.ibm.com>
Author: Sebastien Marineau <sebastien@qnx.com>
Author: Sergey Prigogin <eclipse.sprigogin@gmail.com>
Author: Tanya-Marise De Sousa <tdesous@ca.ibm.com>
Author: Ted Williams <ted@ted.net>
Author: Teodor Madan <teodor.madan@freescale.com>
Author: Thomas Fletcher <thomasf@qnx.com>
Author: Vivian Kong <vivkong@ca.ibm.com>
Author: Vladimir Hirsl <vhirsl@ca.ibm.com>
Author: Warren Paul <warren.paul@nokia.com>

The following aren't listed on dash as past committers. I'll grovel in the eclipse mirror to see if the UIDs are mapped.

Author: bfreeman <>
Author: boxall <>
Author: cecco <>
Author: chanskw <>
Author: dmcknigh <>
Author: enriquev <>
Author: eyasser <>
Author: jduimovich <>
Author: jhandcock <>
Author: khapitas <>
Author: kseitz <>
Author: mkwan <>
Author: rmoseley <>
Author: thomson <>
Author: turnham <>
Author: uid8941 <uid8941>
Author: weisz <>
Comment 7 James Blackburn CLA 2011-05-10 04:47:30 EDT
Bundles we propose to exclude in the conversion:

bash:jamesb:xl-cbga-20:32924> find . -type f -name .project |grep old
./build/old/org.eclipse.cdt.make-feature/.project
./build/old/org.eclipse.cdt.managedbuilder-feature/.project
./build/old/org.eclipse.cdt.managedbuilder.msvc.core/.project
./core/old/org.eclipse.cdt.pdom.core/.project
./core/old/org.eclipse.cdt.pdom.ui/.project
./debug/old/org.eclipse.cdt.debug.win32.core/.project
./debug/old/org.eclipse.cdt.debug.win32.ui/.project
./old/android/org.eclipse.cdt.android.core/.project
./old/android/org.eclipse.cdt.android.ui/.project
./old/android2/org.eclipse.cdt.android.build.core/.project
./old/android2/org.eclipse.cdt.android.build.ui/.project
./old/android2/org.eclipse.cdt.android.debug.core/.project
./old/android2/org.eclipse.cdt.android.debug.ui/.project
./old/android2/org.eclipse.cdt.android.feature/.project
./old/build2/org.eclipse.cdt.build.core/.project
./old/build2/org.eclipse.cdt.build.ui/.project
./old/mylyn/org.eclipse.cdt.mylyn-feature/.project
./old/mylyn/org.eclipse.cdt.mylyn.ui/.project
./old/mylyn/org.eclipse.cdt.mylyn/.project
./old/old2/cdt-home/.project
./old/old2/org.eclipse.cdt.build.core.tests/.project
./old/old2/org.eclipse.cdt.build.core/.project
./old/old2/org.eclipse.cdt.build.gnu.core/.project
./old/old2/org.eclipse.cdt.build.ui/.project
./old/old2/org.eclipse.cdt.mingw-feature/.project
./old/old2/org.eclipse.cdt.mingw.build/.project
./old/old2/org.eclipse.cdt.mingw.debug/.project
./old/old2/org.eclipse.cdt.mingw/.project
./old/old2/org.eclipse.cdt.windows.debug.core.cdi/.project
./old/old2/org.eclipse.cdt.windows.debug.native/.project
./old/org.eclipse.cdt.old/cdt-cpp-extensions-home/.project
./old/org.eclipse.cdt.old/com.ibm.debug.common/.project
./old/org.eclipse.cdt.old/com.ibm.debug.daemon/.project
./old/org.eclipse.cdt.old/com.ibm.debug.pdt/.project
./old/org.eclipse.cdt.old/com.ibm.lpex/.project
./old/org.eclipse.cdt.old/org.eclipse.cdt.cpp.docs.user/.project
./old/org.eclipse.cdt.old/org.eclipse.cdt.cpp.miners.parser/.project
./old/org.eclipse.cdt.old/org.eclipse.cdt.cpp.miners/.project
./old/org.eclipse.cdt.old/org.eclipse.cdt.cpp.ui/.project
./old/org.eclipse.cdt.old/org.eclipse.cdt.debug.gdbPicl/.project
./old/org.eclipse.cdt.old/org.eclipse.cdt.dstore.core/.project
./old/org.eclipse.cdt.old/org.eclipse.cdt.dstore.extra.server/.project
./old/org.eclipse.cdt.old/org.eclipse.cdt.dstore.extra/.project
./old/org.eclipse.cdt.old/org.eclipse.cdt.dstore.hosts/.project
./old/org.eclipse.cdt.old/org.eclipse.cdt.dstore.miners/.project
./old/org.eclipse.cdt.old/org.eclipse.cdt.dstore.ui/.project
./old/org.eclipse.cdt.old/org.eclipse.cdt.linux.help/.project
./old/org.eclipse.cdt.old/org.eclipse.cdt.pa.ui/.project
./old/windows/old/org.eclipse.cdt.windows-feature/.project
./old/windows/old/org.eclipse.cdt.windows.build/.project
./old/windows/old/org.eclipse.cdt.windows.debug.cdi.core/.project
./old/windows/old/org.eclipse.cdt.windows.debug.cdi.ui/.project
./old/windows/old/org.eclipse.cdt.windows.debug.core/.project
./old/windows/old/org.eclipse.cdt.windows.debug.native/.project
./old/windows/old/org.eclipse.cdt.windows.debug.tests/.project
./old/windows/old/org.eclipse.cdt.windows/.project
./old/windows/old/really_old/org.eclipse.cdt.windows.debug.core/.project
./old/windows/old/really_old/org.eclipse.cdt.windows.debug.debugger/.project
./old/windows/old/really_old/org.eclipse.cdt.windows.debug.tests.app/.project
./old/windows/old/really_old/org.eclipse.cdt.windows.debug.tests/.project
./old/windows/old/really_old/org.eclipse.cdt.windows.debug.ui/.project
./old/windows/org.eclipse.cdt.csharp.build/.project
./old/windows/org.eclipse.cdt.csharp.core.tests/.project
./old/windows/org.eclipse.cdt.csharp.core/.project
./old/windows/org.eclipse.cdt.csharp.msw.build/.project
./old/windows/org.eclipse.cdt.csharp.ui/.project
./old/windows/org.eclipse.cdt.msw.debug.core.tests/.project
./old/windows/org.eclipse.cdt.msw.debug.core/.project
./old/windows/org.eclipse.cdt.msw.debug.native/.project
./old/windows/org.eclipse.cdt.msw.debug.ui/.project
./releng/old/org.eclipse.cdt.aix-feature/.project
./releng/old/org.eclipse.cdt.aix/.project
./releng/old/org.eclipse.cdt.linux.gtk-feature/.project
./releng/old/org.eclipse.cdt.linux.gtk/.project
./releng/old/org.eclipse.cdt.linux.motif-feature/.project
./releng/old/org.eclipse.cdt.linux.motif/.project
./releng/old/org.eclipse.cdt.qnx.photon-feature/.project
./releng/old/org.eclipse.cdt.qnx.photon/.project
./releng/old/org.eclipse.cdt.solaris.motif-feature/.project
./releng/old/org.eclipse.cdt.solaris.motif/.project
./releng/old/org.eclipse.cdt.source-feature/.project
./releng/old/org.eclipse.cdt.source/.project
./releng/old/org.eclipse.cdt.win32-feature/.project
./releng/old/org.eclipse.cdt.win32/.project


bash:jamesb:xl-cbga-20:32925> find . -type f -name .project |grep contrib
./contrib/org.eclipse.cdt.oprofile-home/.project
./contrib/org.eclipse.cdt.oprofile/org.eclipse.cdt.oprofile-feature/.project
./contrib/org.eclipse.cdt.oprofile/org.eclipse.cdt.oprofile.core.linux/.project
./contrib/org.eclipse.cdt.oprofile/org.eclipse.cdt.oprofile.core/.project
./contrib/org.eclipse.cdt.oprofile/org.eclipse.cdt.oprofile.doc/.project
./contrib/org.eclipse.cdt.oprofile/org.eclipse.cdt.oprofile.launch/.project
./contrib/org.eclipse.cdt.oprofile/org.eclipse.cdt.oprofile.releng/.project
./contrib/org.eclipse.cdt.oprofile/org.eclipse.cdt.oprofile.ui/.project
./contrib/org.eclipse.cdt.rpm-home/.project
./contrib/org.eclipse.cdt.rpm/org.eclipse.cdt.rpm-feature/.project
./contrib/org.eclipse.cdt.rpm/org.eclipse.cdt.rpm.core.tests/.project
./contrib/org.eclipse.cdt.rpm/org.eclipse.cdt.rpm.core/.project
./contrib/org.eclipse.cdt.rpm/org.eclipse.cdt.rpm.doc/.project
./contrib/org.eclipse.cdt.rpm/org.eclipse.cdt.rpm.propertypage/.project
./contrib/org.eclipse.cdt.rpm/org.eclipse.cdt.rpm.ui/.project

bash:jamesb:xl-cbga-20:32926> find . -type f -name .project |grep cppunit
./cppunit/org.eclipse.cdt-cppunit/org.eclipse.cdt.cppunit-feature/.project
./cppunit/org.eclipse.cdt-cppunit/org.eclipse.cdt.cppunit/.project

bash:jamesb:xl-cbga-20:32927> find . -type f -name .project |grep util
./util/org.antlr.runtime/.project
./util/org.antlr/.project
./util/org.eclipse.cdt.util-feature/.project
./util/org.eclipse.cdt.util/.project
./util/org.eclipse.ffs.core/.project
./util/org.eclipse.ffs.ui/.project
Comment 8 Mike Kucera CLA 2011-05-10 09:26:39 EDT
Please do not include 'c99' in the repository. It is obsolete and has been replaced by the lrparser.
Comment 9 Chris Recoskie CLA 2011-05-10 09:46:16 EDT
The list of bundles excluded seems fine to me.

I would rather not prune the history.  I often walk the history or turn on blame annotation to figure out why a piece of code is there.
Comment 10 Doug Schaefer CLA 2011-05-10 11:38:38 EDT
(In reply to comment #9)
> I would rather not prune the history.  I often walk the history or turn on
> blame annotation to figure out why a piece of code is there.

Pruning history is a last resort. I'm not even sure it's possible. If we can get the size down to around 150MB, I'd be happy.
Comment 11 James Blackburn CLA 2011-05-10 13:35:41 EDT
More authors:

I think I've resolved the remaining unix IDs apart from 2:
    'enriquev' : ('Enrique Varillas','enriquev'),
    'uid8941'  : ('uid8941','uid8941')


    'bfreeman' : ('Bjorn Freeman-Benson','bjorn.freeman-benson@eclipse.org'),
    'boxall' : ('Alan Boxall','boxall@ca.ibm.com'),
    'cecco' : ('Rob Cecco','cecco@ca.ibm.com'),
    'chanskw' : ('Samantha Chan','chanskw@ca.ibm.com'),
    'dmcknigh' : ('David McKnight','dmcknigh@ca.ibm.com'),
    'eyasser' : ('Yasser Elmankabady','eyasser@ca.ibm.com'),
    'jduimovich' : ('John Duimovich','jduimovich@sympatico.ca'),
    'jhandcock' : ('Jeremy Handcock','jeremy@aperte.org'),
    'khapitas' : ('Kleo Hapitas','khapitas@ca.ibm.com'),
    'kseitz' : ('Keith Seitz','keiths@redhat.com'),
    'mkwan' : ('Morris Kwan','mkwan@ca.ibm.com'),
    'rmoseley' : ('Rick Moseley','rmoseley@redhat.com'),
    'thomson' : ('Brian Thomson','thomson@ca.ibm.com'),
    'turnham' : ('Jeff Turnham','turnham@ca.ibm.com'),
    'weisz' : ('Robert Weisz','weisz@ca.ibm.com'),
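
For illustration, a minimal sketch of how a map in this shape turns a CVS user ID into a git-style author string. The two entries are copied from the list above; the names `AUTHORS` and `git_author` are hypothetical, and the assumption that the options file consumes such a map through an `author_transforms` setting should be checked against the attached cvs2git.options.

```python
# Hypothetical excerpt of a CVS-username -> (name, email) map, using two
# entries from the list above.
AUTHORS = {
    'kseitz': ('Keith Seitz', 'keiths@redhat.com'),
    'rmoseley': ('Rick Moseley', 'rmoseley@redhat.com'),
}


def git_author(uid, authors=AUTHORS):
    """Return 'Name <email>', falling back to 'uid <>' for unmapped IDs."""
    if uid in authors:
        name, email = authors[uid]
        return '%s <%s>' % (name, email)
    # Same shape as the unresolved entries listed in comment 6.
    return '%s <>' % uid
```

An unmapped ID such as 'uid8941' would come out as `uid8941 <>`, which makes leftovers easy to spot in the converted history.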
Comment 12 Doug Schaefer CLA 2011-05-10 15:23:48 EDT
If we just physically upload your copy of the git repo (i.e. scp or something), do we need to have the authors match?
Comment 13 Doug Schaefer CLA 2011-05-10 15:24:51 EDT
Also a lot of those ids are coming from checkins that would be in the 'old' projects.
Comment 14 James Blackburn CLA 2011-05-10 15:40:37 EDT
(In reply to comment #12)
> If we just physically upload your copy of the git repo (i.e. scp or something),
> do we need to have the authors match?

Nope, we could get webmaster to pull the repo, or disable the pre-commit hook himself.  

Without old and the deprecated plugins discussed in the comments, we're at ~111M. I've updated:
https://github.com/jamesblackburn/org.eclipse.cdt
Comment 15 Doug Schaefer CLA 2011-05-10 15:43:59 EDT
(In reply to comment #14)
> (In reply to comment #12)
> > If we just physically upload your copy of the git repo (i.e. scp or something),
> > do we need to have the authors match?
> 
> Nope, we could get webmaster to pull the repo, or disable the pre-commit hook
> himself.  

I assume we'll have write permission to the git repo directory. But we'll see what he wants to do when it comes time.

> Without old and the deprecated plugins discussed in the comments, we're at
> ~111M. I've updated:
> https://github.com/jamesblackburn/org.eclipse.cdt

Excellent. Ship it :). I'll give it a try in a few minutes.
Comment 16 James Blackburn CLA 2011-05-10 15:47:12 EDT
Created attachment 195267 [details]
cvs2git.options

Added additional Unix UID -> Name + Email mappings
Comment 17 James Blackburn CLA 2011-05-10 15:53:10 EDT
Created attachment 195268 [details]
recipe.txt

Recipe for the conversion. Comment 5 + 


4.1 Move 'old' content into old:
mv org.eclipse.cdt/c99/ org.eclipse.cdt-old/c99
mv org.eclipse.cdt/cppunit/ org.eclipse.cdt-old/
mv org.eclipse.cdt-cppunit/ org.eclipse.cdt-old/cppunit
mv org.eclipse.cdt/releng/old/ org.eclipse.cdt-old/releng
mv org.eclipse.cdt/debug/old/ org.eclipse.cdt-old/debug
mv org.eclipse.cdt/contrib/ org.eclipse.cdt-old/contrib
mv org.eclipse.cdt/build/old/ org.eclipse.cdt-old/build
mv org.eclipse.cdt/util/ org.eclipse.cdt-old/
mv org.eclipse.cdt/core/old/ org.eclipse.cdt-old/core
Comment 18 James Blackburn CLA 2011-05-10 16:38:40 EDT
Created attachment 195276 [details]
list of .projects in the converted org.eclipse.cdt

Current list of projects.

I propose to move:
  core/org.eclipse.cdt.refactoring{.tests}
to old as it doesn't seem to be used and it's not referenced by the map.

Other than that, AFAICS everything else should be correct.
Comment 19 James Blackburn CLA 2011-05-10 16:52:53 EDT
Created attachment 195277 [details]
largest-files.txt

List of the largest ,v files:

One file stands out:
31M tools/org.eclipse.cdt/edc/org.eclipse.cdt.debug.edc.windows/os/win32/x86/EDCWindowsDebugAgent.exe,v
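
A list like this can be reproduced with a short script. The following is a minimal sketch, not the command actually used to build the attachment; the function name `largest_rcs_files` and its `count` parameter are hypothetical.

```python
import os


def largest_rcs_files(root, count=10):
    """Return up to `count` (size_in_bytes, path) pairs for the largest ',v' files."""
    sizes = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # CVS stores each file's history in an RCS file ending in ',v'.
            if name.endswith(',v'):
                path = os.path.join(dirpath, name)
                sizes.append((os.path.getsize(path), path))
    # Largest first; ties are broken by path so the order is deterministic.
    sizes.sort(key=lambda item: (-item[0], item[1]))
    return sizes[:count]
```

Running it over the unpacked cvs/tools tree would surface outliers such as the debug agent binary above.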
Comment 20 James Blackburn CLA 2011-05-10 18:12:34 EDT
Created attachment 195282 [details]
recipe.txt

4.1.1: Move refactoring:
mkdir org.eclipse.cdt-old/core/refactoring
mv org.eclipse.cdt/core/org.eclipse.cdt.refactoring* org.eclipse.cdt-old/core/refactoring/
Comment 21 James Blackburn CLA 2011-05-10 18:25:40 EDT
Two git repositories are now up to date:
  https://github.com/jamesblackburn/org.eclipse.cdt
  https://github.com/jamesblackburn/org.eclipse.cdt-old

Do let me know if there are any issues...
Comment 22 Doug Schaefer CLA 2011-05-10 20:21:32 EDT
(In reply to comment #17)
> mv org.eclipse.cdt/util/ org.eclipse.cdt-old/

Actually, we need the org.eclipse.cdt.util plug-in and feature from the util directory. Everything else can go.
Comment 23 Doug Schaefer CLA 2011-05-10 21:20:20 EDT
The good news is after adding the util plugin and feature, I was able to export the cdt-master for Indigo. I'll try that back to Ganymede/CDT 5 before the cutover.
Comment 24 Doug Schaefer CLA 2011-05-10 21:39:20 EDT
The next thing I notice is that we have way too many Tags. egit seems to be struggling with them all. All of the v tags that don't have builds on our build page can go. Not sure how hard that would be.

Thoughts?
Comment 25 James Blackburn CLA 2011-05-11 04:00:51 EDT
Created attachment 195306 [details]
recipe.txt

Whoops, I verified that the HEADs matched, but didn't actually try to build the thing ;)

Changes:

4.1.2: Save util
mkdir org.eclipse.cdt/util
mv org.eclipse.cdt-old/util/org.eclipse.cdt.util* org.eclipse.cdt/util


8) Delete unwanted tags

   List tags with brief commit comment:
   git tag |xargs -I asdf -n 1 git show -s --format="%h: %cd %cn - %s - asdf" asdf |less
   
   # TODO: Delete tags which aren't in 'tags' file
   
   Show tags missing in the repository that are presently built at: http://download.eclipse.org/tools/cdt/builds/
   cat ../tags|awk '{print $1}'|xargs -n 1 git show -s --pretty=oneline 2> errors   

9) Prune + Repack the repository
  git prune
  git repack -a -d --depth=250 --window=250
  git gc --aggressive
  git repack -a -d --depth=250 --window=250
Comment 26 James Blackburn CLA 2011-05-11 04:02:02 EDT
Created attachment 195307 [details]
Tags

Tags from: http://download.eclipse.org/tools/cdt/builds/ since 5.0.0.
Do we want to go back further than this? Or just preserve this set?
Comment 27 Chris Recoskie CLA 2011-05-11 09:08:41 EDT
(In reply to comment #26)
> Created attachment 195307 [details]
> Tags
> Tags from: http://download.eclipse.org/tools/cdt/builds/ since 5.0.0.
> Do we want to go back further than this? Or just preserve this set?

For clarity, will the pruning prune branch tags, or just non-branch tags?  I assume the latter.
Comment 28 James Blackburn CLA 2011-05-11 09:14:32 EDT
(In reply to comment #27)
> For clarity, will the pruning prune branch tags, or just non-branch tags?  I
> assume the latter.

We're currently just discussing tags.  I'm tempted to include all tags which are still available for download from download.eclipse.org.

There are many fewer branches:

  CDT_2_0_2_BI
  NewParser1
  Parser_SymbolTable
  ScannerDiscovery61
  cdt_1_0_1
  cdt_1_1
  cdt_1_2
  cdt_21
  cdt_2_0
  cdt_3_0
  cdt_3_1
  cdt_4_0
  cdt_5_0
  cdt_5_0_0M5
  cdt_5_0_1post
  cdt_5_0_2post
  cdt_5_0post
  cdt_6_0
  cdt_6_0_2_special
  cdt_7_0
  cdt_7_0_1
  cdt_ast2
  unlabeled-1.6.2
  master
Comment 29 Andrew Gvozdev CLA 2011-05-11 09:49:16 EDT
(In reply to comment #24)
> The next thing I notice is that we have way too many Tags. egit seems to be
> struggling with them all. All of the v tags that don't have builds on our build
> page can go. Not sure how hard that would be.
> Thoughts?
The tags are important in CVS, making it possible to check out the whole project on a tag. I think v-tags are of less importance in git because you can always check out the whole project on a particular commit (not possible in CVS). The v-tags give you the time and one could use that time to find a commit for checkout in git.
If you want to prune v-tags it's ok with me. I think we need to keep all CDT_X_X_X tags, which tag releases and SRs, and recent v-tags for the maintenance releases.

There are a few experimental branches; I suppose we're going to prune them as well?
Comment 30 Andrew Gvozdev CLA 2011-05-11 09:56:21 EDT
(In reply to comment #24)
> The next thing I notice is that we have way too many Tags. egit seems to be
> struggling with them all.
BTW I haven't noticed any problem with tags in egit for CDT repo. Not making a point just an observation. If you decide to keep them fine with me as well.
Comment 31 Doug Schaefer CLA 2011-05-11 11:07:11 EDT
(In reply to comment #26)
> Created attachment 195307 [details]
> Tags
> 
> Tags from: http://download.eclipse.org/tools/cdt/builds/ since 5.0.0.
> Do we want to go back further than this? Or just preserve this set?

You the master :). Excellent.

I suppose we can go back. There aren't that many tags older than 5.0.
Comment 32 Doug Schaefer CLA 2011-05-11 11:14:17 EDT
(In reply to comment #29)
> (In reply to comment #24)
> > The next thing I notice is that we have way too many Tags. egit seems to be
> > struggling with them all. All of the v tags that don't have builds on our build
> > page can go. Not sure how hard that would be.
> > Thoughts?
> The tags are important in CVS making it possible to checkout the whole project
> on a tag. I think v-tags are less of importance in git because you can always
> checkout the whole project on a particular commit (not possible in CVS). The
> v-tags give you the time and one could use that time to find a commit for
> checkout in git.
> If you want to prune v-tags it's ok with me. I think we need to keep all tags
> CDT_X_X_X which tag releases and SR and recent v-tags for the maintenance
> releases.

Tags in git serve the same purpose. They make it easier to check out a particular build. The CDT_* tags are the main builds. The v-tags are for unofficial builds that people had asked for.

That being said, I'm not sure I'll keep applying the tags in the new releng system. We can simply record the commit id on the build page.

> 
> There are a few experimental branches I suppose we gonna prune them as well?

Good question. Should we only keep the cdt* branches? Mind you I'm very nostalgic for the NewParser1 branch which was my first involvement in CDT :).
Comment 33 Doug Schaefer CLA 2011-05-11 11:18:24 EDT
(In reply to comment #19)
> Created attachment 195277 [details]
> largest-files.txt
> 
> List of the largest ,v files:
> 
> One file stands out:
> 31M
> tools/org.eclipse.cdt/edc/org.eclipse.cdt.debug.edc.windows/os/win32/x86/EDCWindowsDebugAgent.exe,v

Back to this file. This is a binary file akin to the spawner.exe, but much larger, of course. Copying Ken. Do we expect this file to change a lot over time? How much is this going to add to our repo as it grows?
Comment 34 James Blackburn CLA 2011-05-11 12:38:45 EDT
Created attachment 195392 [details]
v-tags-to-preserve.txt

v20xxx tags to preserve (all available released builds).
Comment 35 James Blackburn CLA 2011-05-11 12:40:26 EDT
Created attachment 195393 [details]
recipe.txt

Added detail on removing unwanted tags + repo verification:

   # Show tags missing in the repository that are presently built at: http://download.eclipse.org/tools/cdt/builds/
   # Check that they're all there -> none that we're expecting are missing.
   cat ../v-tags-to-preserve.txt|awk '{print $1}'|xargs -n 1 git show -s --pretty=oneline 2> errors
   
   # Remove any v20XXXX tags which aren't in our list of tags to keep.
   git tag |grep v20 > ../vTags.txt
   for i in `cat ../vTags.txt`; do
     if [[ `grep $i ../v-tags-to-preserve.txt` ]] ; then
       echo "Keeping Tag: $i"
     else
       echo "Deleting Tag: $i"
       git tag -d $i
     fi
   done
   
   # Check all the tags we want are still present
   cat ../v-tags-to-preserve.txt|awk '{print $1}'|xargs -n 1 git show -s --pretty=oneline 2> errors
   
   # Remove Root_* tags
   git tag |grep Root_| xargs -n 1 git tag -d


10) Verify repo
  mkdir /tmp/compare-jamesb/
  .../verify-cvs2svn.py --git ../cvs/tools/org.eclipse.cdt/ .../path_to_git_repo/ --tmp=/tmp/compare-jamesb/ --diff
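
Note on the tag-removal loop: `grep $i` treats the tag as a pattern, so a tag whose name is a substring of a preserved tag would also be kept. A minimal sketch of the same decision using exact name matches follows. The function names are hypothetical, and the keep file is assumed to hold the tag name in the first whitespace-separated field of each line, as the `awk '{print $1}'` calls imply.

```python
def read_keep_list(lines):
    """Return the set of tag names taken from the first field of each line."""
    keep = set()
    for line in lines:
        fields = line.split()
        if fields:  # skip blank lines
            keep.add(fields[0])
    return keep


def plan_tag_cleanup(repo_tags, keep):
    """Return (to_delete, missing) for the v20* tags, using exact matches.

    to_delete: tags containing 'v20' that are not in the keep set.
    missing:   wanted tags that the repository does not have.
    """
    v_tags = [t for t in repo_tags if 'v20' in t]
    to_delete = sorted(t for t in v_tags if t not in keep)
    missing = sorted(keep - set(repo_tags))
    return to_delete, missing
```

The `to_delete` list could then be fed to `git tag -d`, and a non-empty `missing` list would flag a preserved build whose tag did not survive the conversion.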
Comment 36 James Blackburn CLA 2011-05-11 12:46:37 EDT
(In reply to comment #29)
> The
> v-tags give you the time and one could use that time to find a commit for
> checkout in git.

There are two issues here: 
 1) v20xxxx doesn't specify a branch.  So you have no idea which stream the tag was on.
 2) In CVS the tag needn't span the whole repository. New tags (in git) would be OK, but when preserving old tags we need to keep them around.

In the future I think we should take Doug's advice and annotate the builds with both a timestamp and a SHA-1 that corresponds to the built artifact.


I've removed the v20XX tags not associated with an archived build, and removed the Root_* tags which are superfluous.

This brings us to 185 tags and 24 branches.
Comment 37 Doug Schaefer CLA 2011-05-11 15:04:18 EDT
(In reply to comment #36)
> (In reply to comment #29)
> > The
> > v-tags give you the time and one could use that time to find a commit for
> > checkout in git.
> 
> There are two issues here: 
>  1) v20xxxx doesn't specify a branch.  So you have no idea which stream the tag
> was on.

The theory is that you start with the version qualifier for the build you care about and add a 'v' to the front and want to check it out to reproduce or create a branch from it.
Comment 38 Andrew Gvozdev CLA 2011-05-11 15:28:40 EDT
(In reply to comment #37)
> (In reply to comment #36)
> > (In reply to comment #29)
> > > The
> > > v-tags give you the time and one could use that time to find a commit for
> > > checkout in git.
> > There are two issues here:
> >  1) v20xxxx doesn't specify a branch.  So you have no idea which stream the
> > tag was on.
> The theory is that you start with the version qualifier for the build you care
> about and add a 'v' to the front and want to check it out to reproduce or create
> a branch from it.
But try to find the tag for the previous build on that branch (to figure out if the issue was introduced with the build) and it is hard in CVS, as you can't distinguish tags for maintenance releases which are built on the same day. Thankfully it is not an issue with git.
Comment 39 Doug Schaefer CLA 2011-05-11 16:24:54 EDT
We can probably clear out the rest of the 'all' directory.

core.linux.ppc64 -> core
the gdb plugins -> debug
gnu.build-feature -> build
gnu.debug-feature -> debug
platform-feature -> releng

Yes/No? Although I have half a mind to create a gnu directory and put all host build and debug related things for the gnu toolchain into there.
Comment 40 James Blackburn CLA 2011-05-12 15:51:24 EDT
(In reply to comment #39)
> We can probably clear out the rest of the 'all' directory.
> 
> core.linux.ppc64 -> core
> the gdb plugins -> debug
> gnu.build-feature -> build
> gnu.debug-feature -> debug
> platform-feature -> releng
> 
> Yes/No? 

Will make a note to do that when we do the migration for real.

> Although I have half a mind to create a gnu directory and put all host
> build and debug related things for the gnu toolchain into there.

Moving stuff in git is basically free (egit bug 302549 notwithstanding). So you can do this after the fact whenever :).
Comment 41 Doug Schaefer CLA 2011-06-06 16:36:42 EDT
Everything's green so far.

James is going to investigate whether it's worth breaking EDC out into its own repo to help with size and the TCF dependency. If it doesn't make much of a difference in size, then we'll just leave it.

With the help of Dave Carver and Alex Blewitt, we have Tycho figured out for the CDT. We'll transition the releng to Tycho immediately after the move. With Tycho, anyone can check the CDT out of git and build the master zip on their own machine without changing anything.

The plan is to make the CDT CVS repos read-only on June 22. I'll raise a separate bug on webmaster to co-ordinate that. We'll run the script to convert over to git and have it up and running likely in a day or two.

Also during the move I'll create the cdt_8_0 branch and commits can get started towards CDT 8.0.1 in Sept. We'll use master towards the Juno release next year.

The promised screencasts are on their way. Look for them in a day or two. There is a lot of other material out there already so I'll focus mainly on things specific to CDT.
Comment 42 Doug Schaefer CLA 2011-06-07 15:31:21 EDT
OK, webmasters have given me ownership over the cdt CVS files. I have control to shut them down when the time comes.
Comment 43 James Blackburn CLA 2011-06-09 06:11:11 EDT
I've updated the repos on github:
https://github.com/jamesblackburn/org.eclipse.cdt
https://github.com/jamesblackburn/org.eclipse.cdt-edc
https://github.com/jamesblackburn/org.eclipse.cdt-old

cdt: Writing objects: 100% (415725/415725), 83.16 MiB | 1.08 MiB/s, done.
cdt-old: Writing objects: 100% (60895/60895), 53.21 MiB | 755 KiB/s, done.
cdt-edc: Writing objects: 100% (11827/11827), 31.95 MiB | 773 KiB/s, done.
Comment 44 James Blackburn CLA 2011-06-09 06:19:33 EDT
The following tags are missing:

fatal: ambiguous argument 'v201106081058': unknown revision or path not in the working tree.
fatal: ambiguous argument 'v201106061419': unknown revision or path not in the working tree.
fatal: ambiguous argument 'v201105301135': unknown revision or path not in the working tree.

The first two are missing because the snapshot was taken on 2011-06-03; for the last one, it looks like HEAD wasn't tagged when the release was made...
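
A rough sketch of how a check like the one above could be scripted. The function name missing_tags is hypothetical, and it assumes a list file with one tag per line, where anything after the first word (such as "(RC4)") is a free-form note.

```shell
# Print every tag named in list file $1 that the repository in the
# current directory does not have. Only the first word of each line is used.
missing_tags() {
    while read -r tag rest; do
        [ -n "$tag" ] || continue
        # --verify --quiet exits non-zero without printing an error if the ref is absent
        git rev-parse --verify --quiet "refs/tags/$tag" >/dev/null || echo "$tag"
    done < "$1"
}
```

Run from inside the converted repository, e.g. missing_tags ../v-tags-to-preserve.txt; an empty result means every listed tag survived the conversion.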
Comment 45 James Blackburn CLA 2011-06-09 06:24:07 EDT
Created attachment 197677 [details]
v-tags-to-preserve.txt

Add recent I-Builds:

CDT 7 stream
118a119
> v201105261654

HEAD stream
122a124,128
> v201106081058 (RC4)
> v201106061419
> v201105301135 (RC3)
> v201105201622 (RC2)
> v201105160958 (RC1)
Comment 46 James Blackburn CLA 2011-06-09 06:25:27 EDT
Created attachment 197678 [details]
recipe.txt

Separate out EDC ; move remainder of 'all' into place:

47a48,58
> 4.1.3: Separate out EDC
> mv org.eclipse.cdt/edc/ org.eclipse.cdt.edc
>
> 4.1.4: Move plugins under 'all' to their rightful place
> mv org.eclipse.cdt/all/org.eclipse.cdt.core.linux.ppc64/ org.eclipse.cdt/core/
> mv org.eclipse.cdt/all/org.eclipse.cdt.gdb* org.eclipse.cdt/debug/
> mv org.eclipse.cdt/all/org.eclipse.cdt.gnu.build-feature/ org.eclipse.cdt/build/
> mv org.eclipse.cdt/all/org.eclipse.cdt.gnu.debug-feature/ org.eclipse.cdt/debug/
> mv org.eclipse.cdt/all/org.eclipse.cdt.platform-feature/ org.eclipse.cdt/releng/
>
>
83,84c94,95
<    git tag |grep v20 > ../vTags.txt
<    for i in `cat ../vTags.txt`; do
---
>    git tag |grep v20 > vTags.txt
>    for i in `cat vTags.txt`; do
Comment 47 James Blackburn CLA 2011-06-09 06:26:50 EDT
Diff between the .options file for org.eclipse.cdt and org.eclipse.cdt-edc:

bash:jamesb:xl-cbga-20:33631> diff cvs2git.options cvs2git-edc.options
128c128
< ctx.tmpdir = r'cvs2svn-tmp'
---
> ctx.tmpdir = r'cvs2svn-edc-tmp'
166c166
< ctx.revision_collector = ExternalBlobGenerator('cvs2svn-tmp/git-blob.dat')
---
> ctx.revision_collector = ExternalBlobGenerator('cvs2svn-edc-tmp/git-blob.dat')
265c265
< ctx.symbol_info_filename = 'symbol-info.txt'
---
> ctx.symbol_info_filename = 'symbol-info-edc.txt'
632c632
<     r'cvs/tools/org.eclipse.cdt',
---
>     r'cvs/tools/org.eclipse.cdt-edc',


And similarly for org.eclipse.cdt-old
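
As a sketch only, the four-line difference shown above could be produced mechanically from the main options file. The function name derive_options is hypothetical, and the patterns assume the lines look exactly as in the diff.

```shell
# Derive cvs2git-<suffix>.options from cvs2git.options in the current directory.
# $1 = suffix such as "edc" or "old".
derive_options() {
    suffix=$1
    sed -e "s|cvs2svn-tmp|cvs2svn-$suffix-tmp|g" \
        -e "s|'symbol-info\.txt'|'symbol-info-$suffix.txt'|" \
        -e "s|r'cvs/tools/org\.eclipse\.cdt'|r'cvs/tools/org.eclipse.cdt-$suffix'|" \
        cvs2git.options > "cvs2git-$suffix.options"
}
```

Running derive_options edc and then diffing the two files should reproduce the listing above; derive_options old would give the third variant.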
Comment 48 Doug Schaefer CLA 2011-06-09 10:20:04 EDT
Excellent. Thanks James. Any thoughts on EDC? 32MB versus 83MB for the rest is pretty significant. But then so is the pain of optional parts of the CDT build.

Also, could we put all the features into the releng directory? These should hardly ever change except when we're playing around with the releng.
Comment 49 James Blackburn CLA 2011-06-10 04:28:26 EDT
(In reply to comment #48)
> Excellent. Thanks James. Any thoughts on EDC? 32MB versus 83MB for the rest is
> pretty significant. But then so is the pain of optional parts of the CDT build.

Given it adds an additional 50% space requirement, and it has an external TCF dependency, I'd be tempted to leave edc in a separate repo.

> Also, could we put all the features into the releng directory? These should
> hardly ever change except when we're playing around with the releng.

I'm not sure about this one... Where features are part of a component, to me it makes sense to keep them there.  So if we ever decided to split out a component having the feature go along with the referenced plugins would be reasonable, no? 

I think the features currently look like this: 

find . -type d -name '*-feature'
./build/org.eclipse.cdt.gnu.build-feature
./codan/org.eclipse.cdt.codan-feature
./cross/org.eclipse.cdt.build.crossgcc-feature
./cross/org.eclipse.cdt.launch.remote-feature
./debug/org.eclipse.cdt.gdb-feature
./debug/org.eclipse.cdt.gnu.debug-feature
./dsf-gdb/org.eclipse.cdt.gnu.dsf-feature
./dsf/org.eclipse.cdt.examples.dsf-feature
./jtag/org.eclipse.cdt.debug.gdbjtag-feature
./memory/org.eclipse.cdt.debug.ui.memory-feature
./p2/org.eclipse.cdt.p2-feature
./releng/org.eclipse.cdt-feature
./releng/org.eclipse.cdt.platform-feature
./releng/org.eclipse.cdt.sdk-feature
./releng/org.eclipse.cdt.testing-feature
./upc/org.eclipse.cdt.bupc-feature
./util/org.eclipse.cdt.util-feature
./windows/org.eclipse.cdt.msw-feature
./xlc/org.eclipse.cdt.xlc.sdk-feature
Comment 50 Doug Schaefer CLA 2011-06-10 12:10:46 EDT
(In reply to comment #49)
> Given it adds an additional 50% space requirement, and it has an external TCF
> dependency, I'd be tempted to leave edc in a separate repo.

Sold. We'll make EDC a separate repo. I thought about the build script and I think it's probably a pretty minor change in the end to deal with it. I'll have to deal with TCF being a separate repo anyway.

> I'm not sure about this one... Where features are part of a component, to me it
> makes sense to keep them there.  So if we ever decided to split out a component
> having the feature go along with the referenced plugins would be reasonable,
> no? 

Fair enough. I don't have a strong opinion on that. +1 for your plan.
Comment 51 Doug Schaefer CLA 2011-06-10 16:29:14 EDT
BTW, I have uploaded the two repos James created to our git space under the test2 folder.

   http://git.eclipse.org/c/cdt/test2/org.eclipse.cdt.git/
   http://git.eclipse.org/c/cdt/test2/org.eclipse.cdt-edc.git/

Cheers,
Doug.
Comment 52 Marc Khouzam CLA 2011-06-13 10:54:58 EDT
Did we talk about .gitignore files for our repos?
Comment 53 David Carver CLA 2011-06-13 11:11:59 EDT
(In reply to comment #52)
> Did we talk about .gitignore files for our repos?

If you will be using Maven for the builds, then at a minimum you want to have a .gitignore that includes the following:

bin
target


You might also want to consider .gitattributes as well.

http://www.kernel.org/pub/software/scm/git/docs/gitattributes.html
Comment 54 Doug Schaefer CLA 2011-06-13 11:16:36 EDT
(In reply to comment #52)
> Did we talk about .gitignore files for our repos?

Do you have something specific in mind? At the moment I don't see anything that needs .gitignore other than what Dave has pointed out with bin and target.
Comment 55 Marc Khouzam CLA 2011-06-13 12:07:03 EDT
(In reply to comment #54)
> (In reply to comment #52)
> > Did we talk about .gitignore files for our repos?
> 
> Do you have something specific in mind? At the moment I don't see anything that
> needs .gitignore other than what Dave has pointed out with bin and target.

Just the bin/ directories that I see in my 'git status'.  I assume that .gitignore should be part of the repo itself?
Comment 56 Doug Schaefer CLA 2011-06-13 14:15:14 EDT
(In reply to comment #55)
> Just the bin/ directories that I see in my 'git status'.  I assume that
> .gitignore should be part of the repo itself?

egit ignores the bin directories anyway.

But yes, each project should get a .gitignore with bin and target (target will come from Maven/Tycho). I'll do that at conversion time.
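
One way the per-project .gitignore step could be scripted, as a sketch. The function name add_gitignores is hypothetical. It assumes a project is any directory holding an Eclipse .project file, and it overwrites any .gitignore already there.

```shell
# Write a .gitignore listing bin and target into every directory
# under $1 that contains a .project file.
add_gitignores() {
    find "$1" -name .project -type f | while read -r project; do
        dir=$(dirname "$project")
        printf 'bin\ntarget\n' > "$dir/.gitignore"
    done
}
```

Calling add_gitignores org.eclipse.cdt before the first commit after conversion would cover all projects in one pass.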
Comment 57 Marc Khouzam CLA 2011-06-13 15:17:34 EDT
Also, will the CDT Genie work for Git commits?
Comment 58 Doug Schaefer CLA 2011-06-13 15:20:49 EDT
Probably not until someone makes it work with Git. Or we move to Gerrit and get it to update bugzilla.
Comment 59 David Carver CLA 2011-06-13 15:56:58 EDT
(In reply to comment #58)
> Probably not until someone makes it work with Git. Or we move to Gerrit and get
> it to update bugzilla.

If CDT Genie is a commit hook you have a couple of different options.

You can use Git Zilla, which is a git pre-receive hook for when somebody pushes to a git repository:

http://www.theoldmonk.net/gitzilla/

You could have the Webmasters install the Hudson Bugzilla plugin:

http://wiki.hudson-ci.org/display/HUDSON/Bugzilla+Plugin
Comment 60 James Blackburn CLA 2011-06-14 13:09:12 EDT
Doug, do we have a date when we want the conversion to happen for real yet?  We'll need the webmaster to update tools-cvs.tgz on:
http://archive.eclipse.org/arch/
(and I'll need a chance to download this from here).  
It would be good if we did this before Indigo goes live and clobbers eclipse.org ;)
Comment 61 Doug Schaefer CLA 2011-06-14 14:00:42 EDT
I have a feeling they'll be really busy when we need them for this.

As one option, I could create the tarball myself and make it available on our download area.

And we can do that anytime.

And I can do it so that we only include the CDT folders. 2.7GB! no wonder it was taking so long to download.
Comment 62 Andrew Gvozdev CLA 2011-06-14 16:22:06 EDT
James, could you remove branch ScannerDiscovery61 from the repository while doing conversion? It is not useful anymore as I hosted it on GitHub. I'll merge my changes in there after the final conversion.
Comment 63 Marc Khouzam CLA 2011-06-15 10:40:43 EDT
James showed me that we can include the DSF/DSF-GDB history of the DD project into the CDT Git repo!

That will give us access to many of the early design decisions and is very valuable.

James, can you graft that history into the main repo?

Thanks a lot!
Comment 64 James Blackburn CLA 2011-06-15 16:02:23 EDT
Created attachment 198050 [details]
recipe.txt

8.1) Delete unwanted branches
   git branch -D ScannerDiscovery61
   
9) Graft in history from other projects

9.1   Add DSF
   git remote add dsf ../org.eclipse.dd.dsf
   git fetch dsf master
   
   #Find Pawel's first commit of DSF into CDT
      git log --pretty=oneline |grep "Migrated DSF and DSF-GDB to the CDT project." |tail -n 1|awk '{print $1}' |tr -d '\n' > grafts
      echo -n " " >> grafts
   #It still needs to point at its parent:
      git log --pretty=oneline |grep "Migrated DSF and DSF-GDB to the CDT project." |tail -n 1|awk '{print $1}' |xargs git rev-list -n 2|tail -n 1 |tr -d '\n' >> grafts
   #Graft the dsf DAG into CDT. This is the last commit to DSF before Pawel's commit to CDT.  The SHA-1 won't change as DSF is archived and I'll only import it once :)
      echo " c1e6da229b8ffcea160498f034bfa6bc8ff6f230" >> grafts
   #Move the graft in - we need to do this last as the graft will mess up the git log's above
      cat grafts >> .git/info/grafts
       
   Now check the history with gitk or likewise.  git show <commit_id> should show it was a merge.
   
9.2   Add the traditional-memory history:
      git remote add memory ../org.eclipse.dd.memory/
      git fetch memory master
      git log --pretty=oneline |grep "DSDP-DD -> CDT initial commit" |tail -n 1|awk '{print $1}' |tr -d '\n' > grafts
      echo -n " " >> grafts
      git log --pretty=oneline |grep "DSDP-DD -> CDT initial commit" |tail -n 1|awk '{print $1}' |xargs git rev-list -n 2|tail -n 1 |tr -d '\n' >> grafts
      echo " 8a526a1b6d440e8078cb51f345dfa914615b6a6c" >> grafts
      cat grafts >> .git/info/grafts
      
11) Make the grafts permanent
      git fast-export --all | (mkdir ../org.eclipse.cdt2 && cd ../org.eclipse.cdt2 && git init && git fast-import)
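
For readers following the recipe: each line in .git/info/grafts is a commit SHA-1 followed by the SHA-1s that git should treat as its parents. The pipelines in steps 9.1 and 9.2 build such a line piece by piece. A compact sketch of the same idea, with a hypothetical function name graft_line:

```shell
# Print a graft line for commit $1 that keeps its existing first parent
# and adds commit $2 as an extra parent, so $1 then looks like a merge.
graft_line() {
    commit=$(git rev-parse --verify "$1^{commit}") || return 1
    parent=$(git rev-parse --verify "$1^1") || return 1
    extra=$(git rev-parse --verify "$2^{commit}") || return 1
    echo "$commit $parent $extra"
}
```

The output would be appended to .git/info/grafts only after all lookups are done, for the reason given in the recipe: once a graft is in place it changes what git log reports.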
Comment 65 James Blackburn CLA 2011-06-15 16:10:20 EDT
(In reply to comment #63)
> James, can you graft that history into the main repo?

I've grafted in the DSF history and the Traditional memory history from DSDP into the CDT git repo.

This works because git doesn't actually track renames in the repo; rather, it reconstructs them after the fact via cleverness. So grafting the DAG from one repo onto the first commit of the moved content in another causes the added content to gain history, as blame and log detect the file rename / move.

Increases the size of the repo. by a couple MB:
Writing objects: 100% (433211/433211), 85.01 MiB | 779 KiB/s, done.

I've updated: 
https://github.com/jamesblackburn/org.eclipse.cdt
Marc, let me know of any issues.
Comment 66 Marc Khouzam CLA 2011-06-16 11:14:42 EDT
(In reply to comment #65)
> (In reply to comment #63)
> > James, can you graft that history into the main repo?
> 
> I've grafted in the DSF history and the Traditional memory history from DSDP
> into the CDT git repo.

Very smart of you, I hadn't thought about it.

> I've updated: 
> https://github.com/jamesblackburn/org.eclipse.cdt
> Marc, let me know of any issues.

I cloned it and it looks great, although I just checked a couple of files.

Seeing the history is not working properly because of EGit/git limitations for now, as you pointed out.  If we can find a way to do it, we should post it to this bug.
Comment 67 James Blackburn CLA 2011-06-16 13:22:40 EDT
(In reply to comment #66)
> Seeing the history is not working properly because of EGit/git limitations for
> now, as you pointed out.  If we can find a way to do it, we should post it to
> this bug.

I've done some digging on this: 
  git-blame works fine, because it's magic.  
  git log, even with --follow doesn't because it's 'a hack'(1)  

See my question to the git mailing list:
http://git.661346.n2.nabble.com/git-log-follow-doesn-t-follow-a-rename-over-a-merge-td6480971.html

For the moment, to view the full log of a DSF / Memory file, use blame-log.sh shell script from here:
http://git.661346.n2.nabble.com/alternate-log-follow-idea-td1385917.html

Or, alternatively: use git blame <file>, then git log --follow -- <original_file_path>

Given blame knows where the content has come from I'm sure we can persuade the egit people to get the history view to tell us :) (2)

(1) From Linus: "I really never wanted the pain, and never cared enough for it, which is why --follow is such a hack. It literally was designed as a "SVN noob" 
pleaser, not as a "real git functionality" thing."
http://kerneltrap.org/mailarchive/git/2009/1/30/4861064
(2) There are plans for cgit to fix this:
http://git.661346.n2.nabble.com/gsoc-Better-git-log-follow-support-td6188083.html
Comment 68 Doug Schaefer CLA 2011-06-16 14:50:49 EDT
We need to make a list of bugs open against egit. I am really unhappy with the quality of it right now the more I use it with the test repo. Here are the main issues so far:

- performance of status updating and commit
- NPE when creating patches
- Merge workflow on rebase pukes at times
Comment 69 David Carver CLA 2011-06-16 14:59:10 EDT
(In reply to comment #68)
> We need to make a list of bugs open against egit. I am really unhappy with the
> quality of it right now the more I use it with the test repo. Here are the main
> issues so far:
> 
> - performance of status updating and commit
> - NPE when creating patches
> - Merge workflow on rebase pukes at times

Or just add the bugs as dependencies here. This is also a great time to create patches and test cases, if you can replicate specific issues. EGit/JGit is pretty responsive to issues, especially if they have reproducible test cases.
Comment 70 Doug Schaefer CLA 2011-06-16 15:12:03 EDT
(In reply to comment #69)
> Or just add the bugs as dependency here.   Also, great time to create patches,
> and test cases if you can replicate specific issues.   EGit/JGit is pretty
> responsive to issues especially if they have reproducible test cases.

We have a test case the egit devs have full access to. 
   git://git.eclipse.org/gitroot/cdt/test3/org.eclipse.cdt.git.

I'm standing on the ledge and I need someone to talk me down before I wave the conversion off until these issues are resolved.
Comment 71 David Carver CLA 2011-06-16 16:37:23 EDT
(In reply to comment #70)
> (In reply to comment #69)
> > Or just add the bugs as dependency here.   Also, great time to create patches,
> > and test cases if you can replicate specific issues.   EGit/JGit is pretty
> > responsive to issues especially if they have reproducible test cases.
> 
> We have a test case the egit devs have full access to. 
>    git://git.eclipse.org/gitroot/cdt/test3/org.eclipse.cdt.git.
> 
> I'm standing on the ledge and I need someone to talk me down before I wave the
> conversion off until these issues are resolved.

I'm not a committer on CDT so I have no stake in the conversion, but I will say this: unless projects convert to git and use EGit, you won't find the bugs and performance issues. If they can address the performance items early in the upcoming dev cycle, then you are fine. There ARE workarounds for the issue until then, and it won't stop people from working. Yes, it'll be inconvenient for some, but I view EGit like the CVS client was early in its development cycle: it'll improve as people use it.
Comment 72 Doug Schaefer CLA 2011-06-20 11:33:23 EDT
Patching using git patches works for me. I've removed the bug for the NPE we saw with non-git patches from our depends on list.

I'm still trying to figure out the merge conflict resolution workflow, which I want to have sorted out before giving the final green light for the move.

The performance issue is not really a killer and we do have the egit gang working on a solution. I'll remove the depends on for that too.
Comment 73 Doug Schaefer CLA 2011-06-23 12:32:14 EDT
OK, CVS made read-only, tar ball created and available for download.
Comment 74 James Blackburn CLA 2011-06-24 12:13:57 EDT
Created attachment 198548 [details]
recipe.txt

2.5) Ensure directories are writable
find org.eclipse.cdt* -type d -exec chmod ug+w "{}" \;

8.1) Delete unwanted branches
   git branch -D ScannerDiscovery61
   git tag -d SD61-01
   git tag -d ScannerDiscovery61_Contributors
Comment 75 James Blackburn CLA 2011-06-24 12:15:39 EDT
Created attachment 198549 [details]
v-tags-to-preserve.txt

+ v201106081058 (Final)
Comment 76 James Blackburn CLA 2011-06-24 15:22:24 EDT
Repos. are at: http://git.eclipse.org/c/cdt/org.eclipse.cdt

cdt main: http://git.eclipse.org/c/cdt/org.eclipse.cdt.git/

cdt.edc: http://git.eclipse.org/c/cdt/org.eclipse.cdt.edc.git/

cdt.old: http://git.eclipse.org/c/cdt/org.eclipse.cdt.old.git/
    old = everything else that wasn't included in the conversion above.
    One tag was renamed: HEAD -> HEAD_CVS (as HEAD is magic ref in git).

SHA-1s for cdt.edc and cdt haven't diverged since the last conversion (though for cdt.main the last conversion was the github one, which contains the DSDP-DSF and DSDP-memory history grafted in).

All tags and branches verify correctly for main and edc.  A couple of files on master that contain keyword expansion tags ($Id) differ:
 + codan/org.eclipse.cdt.codan.core.test/src/org/eclipse/cdt/codan/core/internal/checkers/UnusedSymbolInFileScopeCheckerTest.java
 + codan/org.eclipse.cdt.codan.ui/src/org/eclipse/cdt/codan/ui/LabelFieldEditor.java

Doug, Marc and others please verify -- I'm hoping this is done :)
Comment 77 Andrew Overholt CLA 2011-06-24 16:35:47 EDT
Everything cloned quickly for me and things are building as I type this.  It'll now be much easier to follow CDT development!  Thanks for all the hard work, James, Doug, et al!
Comment 78 James Blackburn CLA 2011-06-24 18:40:26 EDT
Finished verifying old. A bunch of deltas in $Id and $Name - nothing semantic.
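
A sketch of one way to run this kind of verification so that keyword-only deltas drop out. The function name compare_trees is hypothetical. It assumes a CVS checkout and a git checkout of the same tag side by side, and a diff that supports -I (GNU diff does).

```shell
# Recursively compare CVS checkout $1 with git checkout $2.
# Hunks whose changed lines all contain $Id or $Name are ignored,
# and the CVS and .git bookkeeping directories are skipped.
compare_trees() {
    diff -r -I '\$Id' -I '\$Name' -x CVS -x .git "$1" "$2"
}
```

No output means the two trees match apart from keyword expansion; anything printed is a difference worth looking at.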
Comment 79 Doug Schaefer CLA 2011-06-24 20:26:48 EDT
Huge thanks to James for all his work on making this happen. I can take my eyes off my workspace. Looks sweet. :) Marking closed.