| Summary: | CDT git migration | | |
|---|---|---|---|
| Product: | [Tools] CDT | Reporter: | James Blackburn <jamesblackburn+eclipse> |
| Component: | cdt-releng | Assignee: | cdt-releng-inbox <cdt-releng-inbox> |
| Status: | CLOSED FIXED | QA Contact: | Doug Schaefer <cdtdoug> |
| Severity: | normal | | |
| Priority: | P3 | CC: | aleherb+eclipse, caniszczyk, cdtdoug, cjashfor, d_a_carver, ken.ryall, malaperle, marc.khouzam, mikekucera, pwebster, recoskie, vivkong |
| Version: | 7.0 | | |
| Target Milestone: | --- | | |
| Hardware: | PC | | |
| OS: | All | | |
| Whiteboard: | | | |
| Bug Depends on: | 303404, 316211, 316212, 348474 | | |
| Bug Blocks: | 345659 | | |
| Attachments: | | | |

Description
James Blackburn
James, I tried your repo on github. Is it possible to filter out all directories with /old/ in them? These are things we don't care about in recent history if ever.

Also, is there any way to prune the history? Say for example we declare that the CDT started with the CDT 5.0 drop and remove all versions and branches before that. If anyone wants history before that we'll have the CVS repos available in archive.

contrib and util would be other candidates for not moving over.

and cppunit :).

(In reply to comment #1)
> Also, is there any way to prune the history? Say for example we declare that the
> CDT started with the CDT 5.0 drop and remove all versions and branches before
> that. If anyone wants history before that we'll have the CVS repos available in
> archive.

I don't get why we want to prune the history, often it is invaluable. But if we do, maybe we can rather have 2 Git repositories, one with full history and another one pruned? I would *much* prefer to work with the full history in one repository.

Created attachment 195181 [details]
cvs2git.options

Recipe for conversion. Options used for the conversion attached.

Steps to reproduce:

1) Grab a full copy of the tools repository:
    wget http://archive.eclipse.org/arch/tools-cvs.tgz
   Create a directory to work in.

2) Unpack CDT bits
    tar -zxvf .../tools-cvs.tgz \
        cvs/tools/org.eclipse.cdt \
        cvs/tools/org.eclipse.cdt-build \
        cvs/tools/org.eclipse.cdt-contrib \
        cvs/tools/org.eclipse.cdt-core \
        cvs/tools/org.eclipse.cdt-cppunit \
        cvs/tools/org.eclipse.cdt-debug \
        cvs/tools/org.eclipse.cdt-doc \
        cvs/tools/org.eclipse.cdt-launch \
        cvs/tools/org.eclipse.cdt-old \
        cvs/tools/org.eclipse.cdt-releng \
        cvs/tools/CVSROOT

3) Move the externally stored CDT components into place:
    pushd cvs/tools/
    mv org.eclipse.cdt-build/ org.eclipse.cdt/build
    mv org.eclipse.cdt-contrib/ org.eclipse.cdt/contrib
    mv org.eclipse.cdt-core/ org.eclipse.cdt/core
    mv org.eclipse.cdt-cppunit/ org.eclipse.cdt/cppunit
    mv org.eclipse.cdt-debug/ org.eclipse.cdt/debug
    mv org.eclipse.cdt-doc/ org.eclipse.cdt/doc
    mv org.eclipse.cdt-launch/ org.eclipse.cdt/launch
    mv org.eclipse.cdt-old/ org.eclipse.cdt/old
    mv org.eclipse.cdt-releng/* org.eclipse.cdt/releng/
    popd

4) Remove broken symlinks in the repo (from all to the CDT components we moved above)
    find cvs/ -type l|xargs -n 1 rm

5) Run cvs2git to do the conversion: http://cvs2svn.tigris.org/cvs2git.html
5.1) Get latest cvs2git:
    svn co --username=guest --password="" http://cvs2svn.tigris.org/svn/cvs2svn/trunk cvs2svn-trunk
5.2) Run the conversion (options file attached):
    .../cvs2svn/cvs2git --options=cvs2git.options

6) Import the cvs2git output into git
    mkdir org.eclipse.cdt
    cd org.eclipse.cdt
    git init
    cat ../cvs2svn-tmp/git-blob.dat ../cvs2svn-tmp/git-dump.dat | git fast-import

7) Move tags into place
    python ..../cvs2svn/contrib/git-move-refs.py

A full conversion, includin

cvs2svn Statistics:
------------------
Total CVS Files: 24753
Total CVS Revisions: 150689
Total CVS Branches: 124784
Total CVS Tags: 9305400
Total Unique Tags: 1164
Total Unique Branches: 27
CVS Repos Size in KB: 703716
Total SVN Commits: 27098
First Revision Date: Mon Sep 10 23:15:52 2001
Last Revision Date: Sat Apr 30 19:20:53 2011

The size of the git repository is 250M. The size of the checkout 257M (total size: 507M).
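A note on step 4 of the recipe: `find cvs/ -type l` matches every symlink, so that command removes them all, which is fine as long as every link under `cvs/` really is broken after the move. A more conservative variant that only removes dangling links might look like the following minimal sketch. The function name `remove_dangling_symlinks` is made up for illustration and is not part of the attached recipe.

```python
import os


def remove_dangling_symlinks(root):
    """Delete symlinks under root whose targets no longer exist.

    Hypothetical helper, not part of the attached recipe.
    Returns the sorted list of removed paths.
    """
    removed = []
    for dirpath, dirnames, filenames in os.walk(root):
        # os.walk reports symlinks to directories in dirnames and
        # every other symlink (including dangling ones) in filenames.
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            # islink() looks at the link itself; exists() follows it.
            if os.path.islink(path) and not os.path.exists(path):
                os.remove(path)
                removed.append(path)
    return sorted(removed)
```

Calling `remove_dangling_symlinks("cvs/")` from the working directory would leave any still-valid links in place.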
(In reply to comment #1)
> Is it possible to filter out all
> directories with /old/ in them? These are things we don't care about in recent
> history if ever.
>
> Also, is there any way to prune the history?

We can do both of these if wanted. I thought I would just do the whole lot first to see what the upper bound is...

Obvious things to prune are anything not included in the most recent releng tag:
- build/old
- contrib/
- core/old
- debug/old
- old/
- releng/old
- cppunit/

If you think of anything else please add...

(In reply to comment #4)
> I don't get why we want to prune the history, often it is invaluable. But if we
> do, maybe we can rather have 2 Git repositories, one with full history and
> another one pruned? I would *much* prefer to work with the full history in one
> repository.

Well there's stuff that isn't built, tested, or released. I agree we should be careful about preserving mainline history. But experimental stuff that was dropped can likely just go.

Authors: Using bugzilla I reconstructed author names and emails. The list looks like this for (recent-ish) committers.
Author: Alain Magloire <alain@qnx.com>
Author: Alena Laskavaia <elaskavaia.cdt@gmail.com>
Author: Andrew Ferguson <andrew.ferguson@symbian.com>
Author: Andrew Gvozdev <angvoz.dev@gmail.com>
Author: Andrew Niefer <aniefer@ca.ibm.com>
Author: Anton Leherbauer <anton.leherbauer@windriver.com>
Author: Bogdan Gheorghe <gheorghe@ca.ibm.com>
Author: Chris Recoskie <recoskie@ca.ibm.com>
Author: Chris Wiebe <cwiebe@ftml.net>
Author: David Daoust <dave.daoust@windriver.com>
Author: David Dubrow <david.dubrow@nokia.com>
Author: David Inglis <dinglis@qnx.com>
Author: Doug Schaefer <doug.schaefer@windriver.com>
Author: Ed Swartz <ed.swartz@nokia.com>
Author: Emanuel Graf <egraf@hsr.ch>
Author: Francois Chouinard <fchouinard@gmail.com>
Author: Hoda Amer <hamer@ca.ibm.com>
Author: James Blackburn <jamesblackburn+eclipse@gmail.com>
Author: Jason Montojo <jason.montojo@gmail.com>
Author: John Camelon <jcamelon@ca.ibm.com>
Author: John Cortell <john.cortell@freescale.com>
Author: Judy N. Green <jgreen@qnx.com>
Author: Ken Ryall <ken.ryall@nokia.com>
Author: L. Frank Turovich <frank.turovich@nokia.com>
Author: Leo Treggiari <leo.treggiari@intel.com>
Author: Ling Wang <ling.5.wang@nokia.com>
Author: Marc Khouzam <marc.khouzam@ericsson.com>
Author: Marc-Andre Laperle <malaperle@omnialabs.net>
Author: Markus Schorn <markus.schorn@windriver.com>
Author: Martin Lescuyer <mlescuyer@rational.com>
Author: Mike Kucera <mkucera@ca.ibm.com>
Author: Mikhail Khodjaiants <mikhailkhod@googlemail.com>
Author: Mikhail Sennikovsky <mikhail.sennikovskiy@gmail.com>
Author: Norbert Plött <norbert.ploett@siemens.com>
Author: Oleg Krasilnikov <oleg.krasilnikov@intel.com>
Author: Patrick Chuong <pchuong@ti.com>
Author: Pawel Piech <pawel.piech@windriver.com>
Author: Peter Graves <pgraves@qnx.com>
Author: Randy Rohrbach <Randy.Rohrbach@Windriver.com>
Author: Sean Evoy <sevoy@ca.ibm.com>
Author: Sebastien Marineau <sebastien@qnx.com>
Author: Sergey Prigogin <eclipse.sprigogin@gmail.com>
Author: Tanya-Marise De Sousa <tdesous@ca.ibm.com>
Author: Ted Williams <ted@ted.net>
Author: Teodor Madan <teodor.madan@freescale.com>
Author: Thomas Fletcher <thomasf@qnx.com>
Author: Vivian Kong <vivkong@ca.ibm.com>
Author: Vladimir Hirsl <vhirsl@ca.ibm.com>
Author: Warren Paul <warren.paul@nokia.com>

The following aren't listed on dash as past committers. I'll grovel in the eclipse mirror to see if the UIDs are mapped.

Author: bfreeman <>
Author: boxall <>
Author: cecco <>
Author: chanskw <>
Author: dmcknigh <>
Author: enriquev <>
Author: eyasser <>
Author: jduimovich <>
Author: jhandcock <>
Author: khapitas <>
Author: kseitz <>
Author: mkwan <>
Author: rmoseley <>
Author: thomson <>
Author: turnham <>
Author: uid8941 <uid8941>
Author: weisz <>

Bundles we propose to exclude in the conversion:

bash:jamesb:xl-cbga-20:32924> find . -type f -name .project |grep old
./build/old/org.eclipse.cdt.make-feature/.project
./build/old/org.eclipse.cdt.managedbuilder-feature/.project
./build/old/org.eclipse.cdt.managedbuilder.msvc.core/.project
./core/old/org.eclipse.cdt.pdom.core/.project
./core/old/org.eclipse.cdt.pdom.ui/.project
./debug/old/org.eclipse.cdt.debug.win32.core/.project
./debug/old/org.eclipse.cdt.debug.win32.ui/.project
./old/android/org.eclipse.cdt.android.core/.project
./old/android/org.eclipse.cdt.android.ui/.project
./old/android2/org.eclipse.cdt.android.build.core/.project
./old/android2/org.eclipse.cdt.android.build.ui/.project
./old/android2/org.eclipse.cdt.android.debug.core/.project
./old/android2/org.eclipse.cdt.android.debug.ui/.project
./old/android2/org.eclipse.cdt.android.feature/.project
./old/build2/org.eclipse.cdt.build.core/.project
./old/build2/org.eclipse.cdt.build.ui/.project
./old/mylyn/org.eclipse.cdt.mylyn-feature/.project
./old/mylyn/org.eclipse.cdt.mylyn.ui/.project
./old/mylyn/org.eclipse.cdt.mylyn/.project
./old/old2/cdt-home/.project
./old/old2/org.eclipse.cdt.build.core.tests/.project
./old/old2/org.eclipse.cdt.build.core/.project
./old/old2/org.eclipse.cdt.build.gnu.core/.project
./old/old2/org.eclipse.cdt.build.ui/.project
./old/old2/org.eclipse.cdt.mingw-feature/.project
./old/old2/org.eclipse.cdt.mingw.build/.project
./old/old2/org.eclipse.cdt.mingw.debug/.project
./old/old2/org.eclipse.cdt.mingw/.project
./old/old2/org.eclipse.cdt.windows.debug.core.cdi/.project
./old/old2/org.eclipse.cdt.windows.debug.native/.project
./old/org.eclipse.cdt.old/cdt-cpp-extensions-home/.project
./old/org.eclipse.cdt.old/com.ibm.debug.common/.project
./old/org.eclipse.cdt.old/com.ibm.debug.daemon/.project
./old/org.eclipse.cdt.old/com.ibm.debug.pdt/.project
./old/org.eclipse.cdt.old/com.ibm.lpex/.project
./old/org.eclipse.cdt.old/org.eclipse.cdt.cpp.docs.user/.project
./old/org.eclipse.cdt.old/org.eclipse.cdt.cpp.miners.parser/.project
./old/org.eclipse.cdt.old/org.eclipse.cdt.cpp.miners/.project
./old/org.eclipse.cdt.old/org.eclipse.cdt.cpp.ui/.project
./old/org.eclipse.cdt.old/org.eclipse.cdt.debug.gdbPicl/.project
./old/org.eclipse.cdt.old/org.eclipse.cdt.dstore.core/.project
./old/org.eclipse.cdt.old/org.eclipse.cdt.dstore.extra.server/.project
./old/org.eclipse.cdt.old/org.eclipse.cdt.dstore.extra/.project
./old/org.eclipse.cdt.old/org.eclipse.cdt.dstore.hosts/.project
./old/org.eclipse.cdt.old/org.eclipse.cdt.dstore.miners/.project
./old/org.eclipse.cdt.old/org.eclipse.cdt.dstore.ui/.project
./old/org.eclipse.cdt.old/org.eclipse.cdt.linux.help/.project
./old/org.eclipse.cdt.old/org.eclipse.cdt.pa.ui/.project
./old/windows/old/org.eclipse.cdt.windows-feature/.project
./old/windows/old/org.eclipse.cdt.windows.build/.project
./old/windows/old/org.eclipse.cdt.windows.debug.cdi.core/.project
./old/windows/old/org.eclipse.cdt.windows.debug.cdi.ui/.project
./old/windows/old/org.eclipse.cdt.windows.debug.core/.project
./old/windows/old/org.eclipse.cdt.windows.debug.native/.project
./old/windows/old/org.eclipse.cdt.windows.debug.tests/.project
./old/windows/old/org.eclipse.cdt.windows/.project
./old/windows/old/really_old/org.eclipse.cdt.windows.debug.core/.project
./old/windows/old/really_old/org.eclipse.cdt.windows.debug.debugger/.project
./old/windows/old/really_old/org.eclipse.cdt.windows.debug.tests.app/.project
./old/windows/old/really_old/org.eclipse.cdt.windows.debug.tests/.project
./old/windows/old/really_old/org.eclipse.cdt.windows.debug.ui/.project
./old/windows/org.eclipse.cdt.csharp.build/.project
./old/windows/org.eclipse.cdt.csharp.core.tests/.project
./old/windows/org.eclipse.cdt.csharp.core/.project
./old/windows/org.eclipse.cdt.csharp.msw.build/.project
./old/windows/org.eclipse.cdt.csharp.ui/.project
./old/windows/org.eclipse.cdt.msw.debug.core.tests/.project
./old/windows/org.eclipse.cdt.msw.debug.core/.project
./old/windows/org.eclipse.cdt.msw.debug.native/.project
./old/windows/org.eclipse.cdt.msw.debug.ui/.project
./releng/old/org.eclipse.cdt.aix-feature/.project
./releng/old/org.eclipse.cdt.aix/.project
./releng/old/org.eclipse.cdt.linux.gtk-feature/.project
./releng/old/org.eclipse.cdt.linux.gtk/.project
./releng/old/org.eclipse.cdt.linux.motif-feature/.project
./releng/old/org.eclipse.cdt.linux.motif/.project
./releng/old/org.eclipse.cdt.qnx.photon-feature/.project
./releng/old/org.eclipse.cdt.qnx.photon/.project
./releng/old/org.eclipse.cdt.solaris.motif-feature/.project
./releng/old/org.eclipse.cdt.solaris.motif/.project
./releng/old/org.eclipse.cdt.source-feature/.project
./releng/old/org.eclipse.cdt.source/.project
./releng/old/org.eclipse.cdt.win32-feature/.project
./releng/old/org.eclipse.cdt.win32/.project

bash:jamesb:xl-cbga-20:32925> find . -type f -name .project |grep contrib
./contrib/org.eclipse.cdt.oprofile-home/.project
./contrib/org.eclipse.cdt.oprofile/org.eclipse.cdt.oprofile-feature/.project
./contrib/org.eclipse.cdt.oprofile/org.eclipse.cdt.oprofile.core.linux/.project
./contrib/org.eclipse.cdt.oprofile/org.eclipse.cdt.oprofile.core/.project
./contrib/org.eclipse.cdt.oprofile/org.eclipse.cdt.oprofile.doc/.project
./contrib/org.eclipse.cdt.oprofile/org.eclipse.cdt.oprofile.launch/.project
./contrib/org.eclipse.cdt.oprofile/org.eclipse.cdt.oprofile.releng/.project
./contrib/org.eclipse.cdt.oprofile/org.eclipse.cdt.oprofile.ui/.project
./contrib/org.eclipse.cdt.rpm-home/.project
./contrib/org.eclipse.cdt.rpm/org.eclipse.cdt.rpm-feature/.project
./contrib/org.eclipse.cdt.rpm/org.eclipse.cdt.rpm.core.tests/.project
./contrib/org.eclipse.cdt.rpm/org.eclipse.cdt.rpm.core/.project
./contrib/org.eclipse.cdt.rpm/org.eclipse.cdt.rpm.doc/.project
./contrib/org.eclipse.cdt.rpm/org.eclipse.cdt.rpm.propertypage/.project
./contrib/org.eclipse.cdt.rpm/org.eclipse.cdt.rpm.ui/.project

bash:jamesb:xl-cbga-20:32926> find . -type f -name .project |grep cppunit
./cppunit/org.eclipse.cdt-cppunit/org.eclipse.cdt.cppunit-feature/.project
./cppunit/org.eclipse.cdt-cppunit/org.eclipse.cdt.cppunit/.project

bash:jamesb:xl-cbga-20:32927> find . -type f -name .project |grep util
./util/org.antlr.runtime/.project
./util/org.antlr/.project
./util/org.eclipse.cdt.util-feature/.project
./util/org.eclipse.cdt.util/.project
./util/org.eclipse.ffs.core/.project
./util/org.eclipse.ffs.ui/.project

Please do not include 'c99' in the repository. It is obsolete and has been replaced by the lrparser.

The list of bundles excluded seems fine to me. I would rather not prune the history. I often walk the history or turn on blame annotation to figure out why a piece of code is there.

(In reply to comment #9)
> I would rather not prune the history. I often walk the history or turn on
> blame annotation to figure out why a piece of code is there.

Pruning history is a last resort. I'm not even sure it's possible. If we can get the size down to around 150MB, I'd be happy.

More authors:
I think I've resolved the remaining unix IDs apart from 2:
'enriquev' : ('Enrique Varillas','enriquev'),
'uid8941' : ('uid8941','uid8941')
'bfreeman' : ('Bjorn Freeman-Benson','bjorn.freeman-benson@eclipse.org'),
'boxall' : ('Alan Boxall','boxall@ca.ibm.com'),
'cecco' : ('Rob Cecco','cecco@ca.ibm.com'),
'chanskw' : ('Samantha Chan','chanskw@ca.ibm.com'),
'dmcknigh' : ('David McKnight','dmcknigh@ca.ibm.com'),
'eyasser' : ('Yasser Elmankabady','eyasser@ca.ibm.com'),
'jduimovich' : ('John Duimovich','jduimovich@sympatico.ca'),
'jhandcock' : ('Jeremy Handcock','jeremy@aperte.org'),
'khapitas' : ('Kleo Hapitas','khapitas@ca.ibm.com'),
'kseitz' : ('Keith Seitz','keiths@redhat.com'),
'mkwan' : ('Morris Kwan','mkwan@ca.ibm.com'),
'rmoseley' : ('Rick Moseley','rmoseley@redhat.com'),
'thomson' : ('Brian Thomson','thomson@ca.ibm.com'),
'turnham' : ('Jeff Turnham','turnham@ca.ibm.com'),
'weisz' : ('Robert Weisz','weisz@ca.ibm.com'),
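To make the effect of this mapping concrete, here is a minimal sketch of how a table in this shape can turn a CVS user ID into a git-style identity. It is illustrative only: `AUTHOR_MAP` holds three entries copied from the list above, `git_identity` is a hypothetical helper, and the fallback for unmapped IDs simply mirrors how unresolved users appeared in the earlier listing (`Author: bfreeman <>`). How cvs2git itself consumes the table is defined by the attached options file, not by this snippet.

```python
# Three entries copied from the mapping above; the full table lives in
# the attached cvs2git.options.
AUTHOR_MAP = {
    'boxall': ('Alan Boxall', 'boxall@ca.ibm.com'),
    'kseitz': ('Keith Seitz', 'keiths@redhat.com'),
    'uid8941': ('uid8941', 'uid8941'),
}


def git_identity(cvs_user, author_map=AUTHOR_MAP):
    """Return 'Name <email>' for a CVS user ID (hypothetical helper).

    Unmapped IDs fall back to the bare ID with an empty email, the
    same shape as the unresolved entries listed earlier.
    """
    name, email = author_map.get(cvs_user, (cvs_user, ''))
    return '{} <{}>'.format(name, email)
```

For example, `git_identity('kseitz')` yields the Red Hat address from the table, while an ID that is missing from the table comes back as `name <>`, which makes leftovers easy to spot.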
If we just physically upload your copy of the git repo (i.e. scp or something), do we need to have the authors match? Also a lot of those ids are coming from checkins that would be in the 'old' projects.

(In reply to comment #12)
> If we just physically upload your copy of the git repo (i.e. scp or something),
> do we need to have the authors match?

Nope, we could get webmaster to pull the repo, or disable the pre-commit hook himself.

Without old and the deprecated plugins discussed in the comments, we're at ~111M. I've updated:
https://github.com/jamesblackburn/org.eclipse.cdt

(In reply to comment #14)
> (In reply to comment #12)
> > If we just physically upload your copy of the git repo (i.e. scp or something),
> > do we need to have the authors match?
>
> Nope, we could get webmaster to pull the repo, or disable the pre-commit hook
> himself.

I assume we'll have write permission to the git repo directory. But we'll see what he wants to do when it comes time.

> Without old and the deprecated plugins discussed in the comments, we're at
> ~111M. I've updated:
> https://github.com/jamesblackburn/org.eclipse.cdt

Excellent. Ship it :). I'll give it a try in a few minutes.

Created attachment 195267 [details]
cvs2git.options
Added additional Unix UID -> Name + Email mappings
Created attachment 195268 [details]
recipe.txt

Recipe for the conversion. Comment 5 +

4.1 Move 'old' content into old:
    mv org.eclipse.cdt/c99/ org.eclipse.cdt-old/c99
    mv org.eclipse.cdt/cppunit/ org.eclipse.cdt-old/
    mv org.eclipse.cdt-cppunit/ org.eclipse.cdt-old/cppunit
    mv org.eclipse.cdt/releng/old/ org.eclipse.cdt-old/releng
    mv org.eclipse.cdt/debug/old/ org.eclipse.cdt-old/debug
    mv org.eclipse.cdt/contrib/ org.eclipse.cdt-old/contrib
    mv org.eclipse.cdt/build/old/ org.eclipse.cdt-old/build
    mv org.eclipse.cdt/util/ org.eclipse.cdt-old/
    mv org.eclipse.cdt/core/old/ org.eclipse.cdt-old/core

Created attachment 195276 [details]
list of .projects in the converted org.eclipse.cdt
Current list of projects.
I propose to move:
core/org.eclipse.cdt.refactoring{.tests}
to old as it doesn't seem to be used and it's not referenced by the map.
Other than that, AFAICS everything else should be correct.
Created attachment 195277 [details]
largest-files.txt
List of the largest ,v files:
One file stands out:
31M tools/org.eclipse.cdt/edc/org.eclipse.cdt.debug.edc.windows/os/win32/x86/EDCWindowsDebugAgent.exe,v
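For anyone wanting to regenerate a list like largest-files.txt, the following is one possible sketch. How the attached list was actually produced is not stated in this bug; `largest_rcs_files` is a hypothetical helper that walks a CVS tree and ranks the `,v` history files by size in bytes.

```python
import os


def largest_rcs_files(root, limit=10):
    """Return (size_in_bytes, path) pairs for the largest ',v' files.

    Hypothetical helper; not the tool used for the attachment.
    """
    sizes = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(',v'):
                path = os.path.join(dirpath, name)
                sizes.append((os.path.getsize(path), path))
    # Largest first; ties are ordered by path so output is stable.
    sizes.sort(key=lambda item: (-item[0], item[1]))
    return sizes[:limit]
```

Running it against `cvs/tools/org.eclipse.cdt` should surface large binaries such as the debug agent noted above near the top of the list.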
Created attachment 195282 [details]
recipe.txt
4.1.1: Move refactoring:
mkdir org.eclipse.cdt-old/core/refactoring
mv org.eclipse.cdt/core/org.eclipse.cdt.refactoring* org.eclipse.cdt-old/core/refactoring/
Two git repositories are now up to date:
https://github.com/jamesblackburn/org.eclipse.cdt
https://github.com/jamesblackburn/org.eclipse.cdt-old
Do let me know if there are any issues...

(In reply to comment #17)
> mv org.eclipse.cdt/util/ org.eclipse.cdt-old/

Actually, we need the org.eclipse.cdt.util plug-in and feature from the util directory. Everything else can go.

The good news is after adding the util plugin and feature, I was able to export the cdt-master for Indigo. I'll try that back to Ganymede/CDT 5 before the cutover.

The next thing I notice is that we have way too many Tags. egit seems to be struggling with them all. All of the v tags that don't have builds on our build page can go. Not sure how hard that would be. Thoughts?

Created attachment 195306 [details]
recipe.txt

Woops, I verified the HEADs matched, but didn't actually try to build the thing ;)

Changes:

4.1.2: Save util
    mkdir org.eclipse.cdt/util
    mv org.eclipse.cdt-old/util/org.eclipse.cdt.util* org.eclipse.cdt/util

8) Delete unwanted tags
   List tags with brief commit comment:
    git tag |xargs -I asdf -n 1 git show -s --format="%h: %cd %cn - %s - asdf" asdf |less
    # TODO: Delete tags which aren't in 'tags' file
   Show tags missing in the repository that are presently built at: http://download.eclipse.org/tools/cdt/builds/
    cat ../tags|awk '{print $1}'|xargs -n 1 git show -s --pretty=oneline 2> errors

9) Prune + Repack the repository
    git prune
    git repack -a -d --depth=250 --window=250
    git gc --aggressive
    git repack -a -d --depth=250 --window=250

Created attachment 195307 [details]
Tags

Tags from: http://download.eclipse.org/tools/cdt/builds/ since 5.0.0.
Do we want to go back further than this? Or just preserve this set?

(In reply to comment #26)
> Created attachment 195307 [details]
> Tags
> Tags from: http://download.eclipse.org/tools/cdt/builds/ since 5.0.0.
> Do we want to go back further than this? Or just preserve this set?

For clarity, will the pruning prune branch tags, or just non-branch tags?
I assume the latter.

(In reply to comment #27)
> For clarity, will the pruning prune branch tags, or just non-branch tags? I
> assume the latter.

We're currently just discussing tags. I'm tempted to include all tags which are still available for download from download.eclipse.org.

There are many fewer branches:
CDT_2_0_2_BI
NewParser1
Parser_SymbolTable
ScannerDiscovery61
cdt_1_0_1
cdt_1_1
cdt_1_2
cdt_21
cdt_2_0
cdt_3_0
cdt_3_1
cdt_4_0
cdt_5_0
cdt_5_0_0M5
cdt_5_0_1post
cdt_5_0_2post
cdt_5_0post
cdt_6_0
cdt_6_0_2_special
cdt_7_0
cdt_7_0_1
cdt_ast2
unlabeled-1.6.2
master

(In reply to comment #24)
> The next thing I notice is that we have way too many Tags. egit seems to be
> struggling with them all. All of the v tags that don't have builds on our build
> page can go. Not sure how hard that would be.
> Thoughts?

The tags are important in CVS making it possible to checkout the whole project on a tag. I think v-tags are less of importance in git because you can always checkout the whole project on a particular commit (not possible in CVS). The v-tags give you the time and one could use that time to find a commit for checkout in git.
If you want to prune v-tags it's ok with me. I think we need to keep all tags CDT_X_X_X which tag releases and SR and recent v-tags for the maintenance releases.

There are a few experimental branches I suppose we gonna prune them as well?

(In reply to comment #24)
> The next thing I notice is that we have way too many Tags. egit seems to be
> struggling with them all.

BTW I haven't noticed any problem with tags in egit for CDT repo. Not making a point just an observation. If you decide to keep them fine with me as well.

(In reply to comment #26)
> Created attachment 195307 [details]
> Tags
>
> Tags from: http://download.eclipse.org/tools/cdt/builds/ since 5.0.0.
> Do we want to go back further than this? Or just preserve this set?

You the master :). Excellent. I suppose we can go back. There aren't that many tags older than 5.0.

(In reply to comment #29)
> (In reply to comment #24)
> > The next thing I notice is that we have way too many Tags. egit seems to be
> > struggling with them all. All of the v tags that don't have builds on our build
> > page can go. Not sure how hard that would be.
> > Thoughts?
> The tags are important in CVS making it possible to checkout the whole project
> on a tag. I think v-tags are less of importance in git because you can always
> checkout the whole project on a particular commit (not possible in CVS). The
> v-tags give you the time and one could use that time to find a commit for
> checkout in git.
> If you want to prune v-tags it's ok with me. I think we need to keep all tags
> CDT_X_X_X which tag releases and SR and recent v-tags for the maintenance
> releases.

Tags in git serve the same purpose. They make it easier to check out a particular build. The CDT_* tags are the main builds. The v-tags are for unofficial builds that people had asked for.

That being said, I'm not sure I'll keep applying the tags in the new releng system. We can simply record the commit id on the build page.

>
> There are a few experimental branches I suppose we gonna prune them as well?

Good question. Should we only keep the cdt* branches? Mind you I'm very nostalgic for the NewParser1 branch which was my first involvement in CDT :).

(In reply to comment #19)
> Created attachment 195277 [details]
> largest-files.txt
>
> List of the largest ,v files:
>
> One file stands out:
> 31M
> tools/org.eclipse.cdt/edc/org.eclipse.cdt.debug.edc.windows/os/win32/x86/EDCWindowsDebugAgent.exe,v

Back to this file. This is a binary file akin to the spawner.exe, but much larger, of course. Copying Ken. Do we expect this file to change a lot over time? How much is this going to add to our repo as it grows?

Created attachment 195392 [details]
v-tags-to-preserve.txt
v20xxx tags to preserve (all available released builds).
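The selection behind step 8 of the earlier recipe ("Delete tags which aren't in 'tags' file") can be sketched as below. This is an illustration under stated assumptions, not the script used for the migration: both function names are hypothetical, the preserve file is assumed to carry the tag name in the first whitespace-separated field of each line (as the `awk '{print $1}'` in the recipe implies), and only tags starting with `v20` are candidates. The lookup is an exact set membership test, so a tag is never kept merely because its name is a substring of another entry.

```python
def parse_preserve_list(text):
    """Collect the first field of every non-empty line.

    Hypothetical helper mirroring `awk '{print $1}'`.
    """
    return {line.split()[0] for line in text.splitlines() if line.strip()}


def tags_to_delete(all_tags, preserve):
    """Return the sorted v20* tags that are absent from the preserve set.

    Hypothetical helper; non-v20 tags are never selected.
    """
    keep = set(preserve)
    return sorted(
        tag for tag in all_tags
        if tag.startswith('v20') and tag not in keep
    )
```

Each name returned by `tags_to_delete` would then be passed to `git tag -d`; release tags such as `CDT_7_0_0` are never selected because they do not start with `v20`.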
Created attachment 195393 [details]
recipe.txt

Added detail on removing unwanted tags + repo verification:

    # Show tags missing in the repository that are presently built at: http://download.eclipse.org/tools/cdt/builds/
    # Check that they're all there -> none that we're expecting are missing.
    cat ../v-tags-to-preserve.txt|awk '{print $1}'|xargs -n 1 git show -s --pretty=oneline 2> errors

    # Remove any v20XXXX tags which aren't in our list of tags to keep.
    git tag |grep v20 > ../vTags.txt
    for i in `cat ../vTags.txt`; do
      if [[ `grep $i ../v-tags-to-preserve.txt` ]] ; then
        echo "Keeping Tag: $i"
      else
        echo "Deleting Tag: $i"
        git tag -d $i
      fi
    done

    # Check all the tags we want are still present
    cat ../v-tags-to-preserve.txt|awk '{print $1}'|xargs -n 1 git show -s --pretty=oneline 2> errors

    # Remove Root_* tags
    git tag |grep Root_| xargs -n 1 git tag -d

10) Verify repo
    mkdir /tmp/compare-jamesb/
    .../verify-cvs2svn.py --git ../cvs/tools/org.eclipse.cdt/ .../path_to_git_repo/ --tmp=/tmp/compare-jamesb/ --diff

(In reply to comment #29)
> The
> v-tags give you the time and one could use that time to find a commit for
> checkout in git.

There are two issues here:
1) v20xxxx doesn't specify a branch. So you have no idea which stream the tag was on.
2) In CVS the tag needn't span the whole repository.

New tags (in git) would be OK, but when preserving old tags we need to keep them around. In the future I think we take Doug's advice and annotate the builds with both a timestamp and a SHA-1 that corresponds to the built artifact.

I've removed the v20XX tags not associated with an archived build, and removed the Root_* tags which are superfluous. This brings us to 185 tags and 24 branches.

(In reply to comment #36)
> (In reply to comment #29)
> > The
> > v-tags give you the time and one could use that time to find a commit for
> > checkout in git.
>
> There are two issues here:
> 1) v20xxxx doesn't specify a branch. So you have no idea which stream the tag
> was on.

The theory is that you start with the version qualifier for the build you care about and add a 'v' to the front and want to check it out to reproduce or create a branch from it.

(In reply to comment #37)
> (In reply to comment #36)
> > (In reply to comment #29)
> > > The
> > > v-tags give you the time and one could use that time to find a commit for
> > > checkout in git.
> > There are two issues here:
> > 1) v20xxxx doesn't specify a branch. So you have no idea which stream the
> > tag was on.
> The theory is that you start with the version qualifier for the build you care
> about and add a 'v' to the front and want to check it out to reproduce or create
> a branch from it.

But try to find the tag for the previous build on that branch (to figure if the issue was introduced with the build) and it is hard in CVS as you can't distinguish tags for maintenance releases which are built on the same day. Thankfully it is not an issue with git.

We can probably clear out the rest of the 'all' directory.

core.linux.ppc64 -> core
the gdb plugins -> debug
gnu.build-feature -> build
gnu.debug-feature -> debug
platform-feature -> releng

Yes/No?

Although I have half a mind to create a gnu directory and put all host build and debug related things for the gnu toolchain into there.

(In reply to comment #39)
> We can probably clear out the rest of the 'all' directory.
>
> core.linux.ppc64 -> core
> the gdb plugins -> debug
> gnu.build-feature -> build
> gnu.debug-feature -> debug
> platform-feature -> releng
>
> Yes/No?

Will make a note to do that when we do the migration for real.

> Although I have half a mind to create a gnu directory and put all host
> build and debug related things for the gnu toolchain into there.

Moving stuff in git is basically free (egit bug 302549 notwithstanding). So you can do this after the fact whenever :).

Everything's green so far.
James is going to investigate whether it's worth breaking EDC out into its own repo to help with size and the TCF dependency. If it doesn't make much of a difference in size, then we'll just leave it.

With the help of Dave Carver and Alex Blewitt, we have Tycho figured out for the CDT. We'll transition the releng to Tycho immediately after the move. With Tycho, anyone can check the CDT out of git and build the master zip on their own machines without changing anything.

The plan is to make the CDT CVS repos read-only on June 22. I'll raise a separate bug on webmaster to co-ordinate that. We'll run the script to convert over to git and have it up and running likely in a day or two. Also during the move I'll create the cdt_8_0 branch and commits can get started towards CDT 8.0.1 in Sept. We'll use master towards the Juno release next year.

The promised screencasts are on their way. Look for them in a day or two. There is a lot of other material out there already so I'll focus mainly on things specific to CDT.

OK, webmasters have given me ownership over the cdt CVS files. I have control to shut them down when the time comes.

I've updated the repos on github:
https://github.com/jamesblackburn/org.eclipse.cdt
https://github.com/jamesblackburn/org.eclipse.cdt-edc
https://github.com/jamesblackburn/org.eclipse.cdt-old

cdt: Writing objects: 100% (415725/415725), 83.16 MiB | 1.08 MiB/s, done.
cdt-old: Writing objects: 100% (60895/60895), 53.21 MiB | 755 KiB/s, done.
cdt-edc: Writing objects: 100% (11827/11827), 31.95 MiB | 773 KiB/s, done.

The following tags are missing:
fatal: ambiguous argument 'v201106081058': unknown revision or path not in the working tree.
fatal: ambiguous argument 'v201106061419': unknown revision or path not in the working tree.
fatal: ambiguous argument 'v201105301135': unknown revision or path not in the working tree.

The first two are missing because the snapshot was taken on 2011-06-03; for the last one, it looks like HEAD wasn't tagged when the release was made...

Created attachment 197677 [details]
v-tags-to-preserve.txt

Add recent I-Builds:

CDT 7 stream
118a119
> v201105261654

HEAD stream
122a124,128
> v201106081058 (RC4)
> v201106061419
> v201105301135 (RC3)
> v201105201622 (RC2)
> v201105160958 (RC1)

Created attachment 197678 [details]
recipe.txt

Separate out EDC ; move remainder of 'all' into place:

47a48,58
> 4.1.3: Separate out EDC
> mv org.eclipse.cdt/edc/ org.eclipse.cdt.edc
>
> 4.1.4: Move plugins under 'all' to their rightful place
> mv org.eclipse.cdt/all/org.eclipse.cdt.core.linux.ppc64/ org.eclipse.cdt/core/
> mv org.eclipse.cdt/all/org.eclipse.cdt.gdb* org.eclipse.cdt/debug/
> mv org.eclipse.cdt/all/org.eclipse.cdt.gnu.build-feature/ org.eclipse.cdt/build/
> mv org.eclipse.cdt/all/org.eclipse.cdt.gnu.debug-feature/ org.eclipse.cdt/debug/
> mv org.eclipse.cdt/all/org.eclipse.cdt.platform-feature/ org.eclipse.cdt/releng/
>
>
83,84c94,95
< git tag |grep v20 > ../vTags.txt
< for i in `cat ../vTags.txt`; do
---
> git tag |grep v20 > vTags.txt
> for i in `cat vTags.txt`; do

Diff between the .options file for org.eclipse.cdt and org.eclipse.cdt-edc:

bash:jamesb:xl-cbga-20:33631> diff cvs2git.options cvs2git-edc.options
128c128
< ctx.tmpdir = r'cvs2svn-tmp'
---
> ctx.tmpdir = r'cvs2svn-edc-tmp'
166c166
< ctx.revision_collector = ExternalBlobGenerator('cvs2svn-tmp/git-blob.dat')
---
> ctx.revision_collector = ExternalBlobGenerator('cvs2svn-edc-tmp/git-blob.dat')
265c265
< ctx.symbol_info_filename = 'symbol-info.txt'
---
> ctx.symbol_info_filename = 'symbol-info-edc.txt'
632c632
< r'cvs/tools/org.eclipse.cdt',
---
> r'cvs/tools/org.eclipse.cdt-edc',

And similarly for org.eclipse.cdt-old

Excellent. Thanks James. Any thoughts on EDC? 32MB versus 83MB for the rest is pretty significant. But then so is the pain of optional parts of the CDT build.
Also, could we put all the features into the releng directory? These should hardly ever change except when we're playing around with the releng.

(In reply to comment #48)
> Excellent. Thanks James. Any thoughts on EDC? 32MB versus 83MB for the rest is
> pretty significant. But then so is the pain of optional parts of the CDT build.

Given it adds an additional 50% space requirement, and it has an external TCF dependency, I'd be tempted to leave edc in a separate repo.

> Also, could we put all the features into the releng directory? These should
> hardly ever change except when we're playing around with the releng.

I'm not sure about this one... Where features are part of a component, to me it makes sense to keep them there. So if we ever decided to split out a component, having the feature go along with the referenced plugins would be reasonable, no?

I think the features currently look like this:
find . -type d -name '*-feature'
./build/org.eclipse.cdt.gnu.build-feature
./codan/org.eclipse.cdt.codan-feature
./cross/org.eclipse.cdt.build.crossgcc-feature
./cross/org.eclipse.cdt.launch.remote-feature
./debug/org.eclipse.cdt.gdb-feature
./debug/org.eclipse.cdt.gnu.debug-feature
./dsf-gdb/org.eclipse.cdt.gnu.dsf-feature
./dsf/org.eclipse.cdt.examples.dsf-feature
./jtag/org.eclipse.cdt.debug.gdbjtag-feature
./memory/org.eclipse.cdt.debug.ui.memory-feature
./p2/org.eclipse.cdt.p2-feature
./releng/org.eclipse.cdt-feature
./releng/org.eclipse.cdt.platform-feature
./releng/org.eclipse.cdt.sdk-feature
./releng/org.eclipse.cdt.testing-feature
./upc/org.eclipse.cdt.bupc-feature
./util/org.eclipse.cdt.util-feature
./windows/org.eclipse.cdt.msw-feature
./xlc/org.eclipse.cdt.xlc.sdk-feature

(In reply to comment #49)
> Given it adds an additional 50% space requirement, and it has an external TCF
> dependency, I'd be tempted to leave edc in a separate repo.

Sold. We'll make EDC a separate repo.
I thought about the build script and I think it's probably a pretty minor change in the end to deal with it. I'll have to deal with TCF being a separate repo anyway.

> I'm not sure about this one... Where features are part of a component, to me it
> makes sense to keep them there. So if we ever decided to split out a component
> having the feature go along with the referenced plugins would be reasonable,
> no?

Fair enough. I don't have a strong opinion on that. +1 for your plan.

BTW, I have uploaded the two repos James created to our git space under the test2 folder.
http://git.eclipse.org/c/cdt/test2/org.eclipse.cdt.git/
http://git.eclipse.org/c/cdt/test2/org.eclipse.cdt-edc.git/

Cheers, Doug.

Did we talk about .gitignore files for our repos?

(In reply to comment #52)
> Did we talk about .gitignore files for our repos?

If you will be using Maven for the builds, then at a minimum you want a .gitignore that includes the following:
bin
target

You might also want to consider .gitattributes as well.
http://www.kernel.org/pub/software/scm/git/docs/gitattributes.html

(In reply to comment #52)
> Did we talk about .gitignore files for our repos?

Do you have something specific in mind? At the moment I don't see anything that needs .gitignore other than what Dave has pointed out with bin and target.

(In reply to comment #54)
> (In reply to comment #52)
> > Did we talk about .gitignore files for our repos?
>
> Do you have something specific in mind? At the moment I don't see anything that
> needs .gitignore other than what Dave has pointed out with bin and target.

Just the bin/ directories that I see in my 'git status'. I assume that .gitignore should be part of the repo itself?

(In reply to comment #55)
> Just the bin/ directories that I see in my 'git status'. I assume that
> .gitignore should be part of the repo itself?

egit ignores the bin directories anyway. But yes, each project should get a .gitignore with bin and target (target will come from Maven/Tycho).
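The per-project .gitignore discussed here could be generated with a loop along the following lines. This is only a sketch, not the script used for the conversion: it assumes that every directory containing an Eclipse .project file counts as a project, and the function name add_gitignores is made up for illustration.

```shell
# Hypothetical helper (not part of the conversion recipe): write a .gitignore
# listing bin and target into every directory under the given root that
# contains an Eclipse .project file.
add_gitignores() {
    root="${1:-.}"
    find "$root" -name .project -type f | while IFS= read -r project_file; do
        dir=$(dirname "$project_file")
        # Leave any .gitignore that already exists untouched.
        [ -e "$dir/.gitignore" ] || printf 'bin\ntarget\n' > "$dir/.gitignore"
    done
}
```

It would be run from the top of a checkout as: add_gitignores . and the resulting files then committed with the rest of the repo.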
I'll do that at conversion time.

Also, will the CDT Genie work for Git commits?

Probably not until someone makes it work with Git. Or we move to Gerrit and get it to update bugzilla.

(In reply to comment #58)
> Probably not until someone makes it work with Git. Or we move to Gerrit and get
> it to update bugzilla.

If CDT Genie is a commit hook you have a couple of different options. You can use Git Zilla, which is a git pre-receive hook for when somebody pushes to a git repository: http://www.theoldmonk.net/gitzilla/
You could have the Webmasters install the Hudson Bugzilla plugin: http://wiki.hudson-ci.org/display/HUDSON/Bugzilla+Plugin

Doug, do we have a date when we want the conversion to happen for real yet? We'll need the webmaster to update tools-cvs.tgz on: http://archive.eclipse.org/arch/ (and I'll need a chance to download this from here). It would be good if we did this before Indigo goes live and clobbers eclipse.org ;)

I have a feeling they'll be really busy when we need them for this. As one option, I could create the tarball myself and make it available on our download area. And we can do that anytime. And I can do it so that we only include the CDT folders.

2.7GB! No wonder it was taking so long to download.

James, could you remove branch ScannerDiscovery61 from the repository while doing conversion? It is not useful anymore as I hosted it on GitHub. I'll merge my changes in there after the final conversion.

James showed me that we can include the DSF/DSF-GDB history of the DD project into the CDT Git repo! That will give access to many of the early design decisions, which is very valuable. James, can you graft that history into the main repo? Thanks a lot!

Created attachment 198050 [details]
recipe.txt
8.1) Delete unwanted branches
git branch -D ScannerDiscovery61
9) Graft in history from other projects
9.1 Add DSF
git remote add dsf ../org.eclipse.dd.dsf
git fetch dsf master
#Find Pawel's first commit of DSF into CDT
git log --pretty=oneline |grep "Migrated DSF and DSF-GDB to the CDT project." |tail -n 1|awk '{print $1}' |tr -d '\n' > grafts
echo -n " " >> grafts
#It still needs to point at its parent:
git log --pretty=oneline |grep "Migrated DSF and DSF-GDB to the CDT project." |tail -n 1|awk '{print $1}' |xargs git rev-list -n 2|tail -n 1 |tr -d '\n' >> grafts
#Graft the dsf DAG into CDT. This is the last commit to DSF before Pawel's commit to CDT. The SHA-1 won't change as DSF is archived and I'll only import it once :)
echo " c1e6da229b8ffcea160498f034bfa6bc8ff6f230" >> grafts
#Move the graft in - we need to do this last as the graft will mess up the git log's above
cat grafts >> .git/info/grafts
Now check the history with gitk or likewise. git show <commit_id> should show it was a merge.
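The log | grep | tail | awk pipeline in this recipe picks the oldest commit whose subject matches a given message: git log lists newest commits first, so tail -n 1 selects the oldest match and awk keeps only the hash. A minimal sketch of that step as a reusable function is shown here. The name oldest_matching_commit is hypothetical and not part of the recipe, and it uses grep -F so that the message is matched literally rather than as a regular expression.

```shell
# Hypothetical helper (not part of recipe.txt): read the output of
# `git log --pretty=oneline` on stdin and print the hash of the last, i.e.
# oldest, line that contains the given text.
oldest_matching_commit() {
    grep -F -- "$1" | tail -n 1 | awk '{print $1}'
}
```

Assumed usage inside the converted repo: git log --pretty=oneline | oldest_matching_commit "Migrated DSF and DSF-GDB to the CDT project."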
9.2 Add the traditional-memory history:
git remote add memory ../org.eclipse.dd.memory/
git fetch memory master
git log --pretty=oneline |grep "DSDP-DD -> CDT initial commit" |tail -n 1|awk '{print $1}' |tr -d '\n' > grafts
echo -n " " >> grafts
git log --pretty=oneline |grep "DSDP-DD -> CDT initial commit" |tail -n 1|awk '{print $1}' |xargs git rev-list -n 2|tail -n 1 |tr -d '\n' >> grafts
echo " 8a526a1b6d440e8078cb51f345dfa914615b6a6c" >> grafts
cat grafts >> .git/info/grafts
11) Make the grafts permanent
git fast-export --all | (mkdir ../org.eclipse.cdt2 && cd ../org.eclipse.cdt2 && git init && git fast-import)
(In reply to comment #63)
> James, can you graft that history into the main repo?

I've grafted in the DSF history and the Traditional memory history from DSDP into the CDT git repo. This works because git doesn't actually track renames in the repo; rather, it reconstructs them via cleverness. So grafting the DAG from one repo into the first commit of the moved content in another causes the added content to now have history, as blame and log detect the file rename / move. This increases the size of the repo by a couple of MB:
Writing objects: 100% (433211/433211), 85.01 MiB | 779 KiB/s, done.

I've updated: https://github.com/jamesblackburn/org.eclipse.cdt
Marc, let me know of any issues.

(In reply to comment #65)
> (In reply to comment #63)
> > James, can you graft that history into the main repo?
>
> I've grafted in the DSF history and the Traditional memory history from DSDP
> into the CDT git repo.

Very smart of you, I hadn't thought about it.

> I've updated:
> https://github.com/jamesblackburn/org.eclipse.cdt
> Marc, let me know of any issues.

I cloned it and it looks great, although I just checked a couple of files. Seeing the history is not working properly because of EGit/git limitations for now, as you pointed out. If we can find a way to do it, we should post it to this bug.

(In reply to comment #66)
> Seeing the history is not working properly because of EGit/git limitations for
> now, as you pointed out. If we can find a way to do it, we should post it to
> this bug.

I've done some digging on this: git-blame works fine, because it's magic.
git log, even with --follow, doesn't, because it's 'a hack' (1). See my question to the git mailing list: http://git.661346.n2.nabble.com/git-log-follow-doesn-t-follow-a-rename-over-a-merge-td6480971.html

For the moment, to view the full log of a DSF / Memory file, use the blame-log.sh shell script from here: http://git.661346.n2.nabble.com/alternate-log-follow-idea-td1385917.html
Or, alternatively: use git blame <file>, then git log --follow -- <original_file_path>

Given blame knows where the content has come from, I'm sure we can persuade the egit people to get the history view to tell us :) (2)

(1) From Linus: "I really never wanted the pain, and never cared enough for it, which is why --follow is such a hack. It literally was designed as a "SVN noob" pleaser, not as a "real git functionality" thing." http://kerneltrap.org/mailarchive/git/2009/1/30/4861064
(2) There are plans for cgit to fix this: http://git.661346.n2.nabble.com/gsoc-Better-git-log-follow-support-td6188083.html

We need to make a list of bugs open against egit. I am really unhappy with the quality of it right now the more I use it with the test repo. Here are the main issues so far:
- performance of status updating and commit
- NPE when creating patches
- Merge workflow on rebase pukes at times

(In reply to comment #68)
> We need to make a list of bugs open against egit. I am really unhappy with the
> quality of it right now the more I use it with the test repo. Here are the main
> issues so far:
>
> - performance of status updating and commit
> - NPE when creating patches
> - Merge workflow on rebase pukes at times

Or just add the bugs as dependencies here. Also, this is a great time to create patches, and test cases if you can replicate specific issues. EGit/JGit is pretty responsive to issues, especially if they have reproducible test cases.

(In reply to comment #69)
> Or just add the bugs as dependency here. Also, great time to create patches,
> and test cases if you can replicate specific issues. EGit/JGit is pretty
> responsive to issues especially if they have reproducable test cases.

We have a test case the egit devs have full access to.
git://git.eclipse.org/gitroot/cdt/test3/org.eclipse.cdt.git.

I'm standing on the ledge and I need someone to talk me down before I wave the conversion off until these issues are resolved.

(In reply to comment #70)
> (In reply to comment #69)
> > Or just add the bugs as dependency here. Also, great time to create patches,
> > and test cases if you can replicate specific issues. EGit/JGit is pretty
> > responsive to issues especially if they have reproducable test cases.
>
> We have a test case the egit devs have full access to.
> git://git.eclipse.org/gitroot/cdt/test3/org.eclipse.cdt.git.
>
> I'm standing on the ledge and I need someone to talk me down before I wave the
> conversion off until these issues are resolved.

I'm not a committer on CDT, so I have no stake in the conversion, but I will say this: unless projects convert to git and use EGit, you won't find the bugs and performance issues. If they can address the performance items early in the upcoming dev cycle, then you are fine. There ARE workarounds for the issues until then, and they won't stop people from working. Yes, it'll be inconvenient for some, but I view EGit like the CVS client was early in its development cycle: it'll improve as people use it.

Patching using git patches works for me. I've removed the bug for the NPE we saw with non-git patches from our depends-on list. I'm still trying to figure out the merge conflict resolution workflow, which I want to have for the final green light for the move. The performance issue is not really a killer and we do have the egit gang working on a solution. I'll remove the depends-on for that too.

OK, CVS made read-only, tar ball created and available for download.

Created attachment 198548 [details]
recipe.txt
2.5) Ensure directories are writable
find org.eclipse.cdt* -type d -exec chmod ug+w "{}" \;
8.1) Delete unwanted branches
git branch -D ScannerDiscovery61
git tag -d SD61-01
git tag -d ScannerDiscovery61_Contributors
Created attachment 198549 [details]
v-tags-to-preserve.txt
+ v201106081058 (Final)
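Checking which of the tags to preserve are absent from a converted repo (the situation reported earlier in this bug as "ambiguous argument" errors) can be sketched as below. This is an illustration only: it assumes that v-tags-to-preserve.txt holds one tag per line, optionally followed by a note such as "(RC4)", and that the second file holds the output of git tag. The function name missing_tags is hypothetical.

```shell
# Hypothetical helper: print every tag named in the "to preserve" file ($1,
# first column only, so trailing notes like "(RC4)" are ignored) that does
# not appear in a file holding the output of `git tag` ($2).
missing_tags() {
    awk 'FILENAME == ARGV[1] { present[$1] = 1; next }
         NF > 0 && !($1 in present) { print $1 }' "$2" "$1"
}
```

Assumed usage: git tag > vTags.txt followed by missing_tags v-tags-to-preserve.txt vTags.txt, where an empty result means no listed tag is missing.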
Repos. are at: http://git.eclipse.org/c/cdt/org.eclipse.cdt
cdt main: http://git.eclipse.org/c/cdt/org.eclipse.cdt.git/
cdt.edc: http://git.eclipse.org/c/cdt/org.eclipse.cdt.edc.git/
cdt.old: http://git.eclipse.org/c/cdt/org.eclipse.cdt.old.git/
old = everything else that wasn't included in the conversion above.

One tag was renamed: HEAD -> HEAD_CVS (as HEAD is a magic ref in git).

SHA-1s for cdt.edc and cdt haven't diverged since the last conversion (though for cdt.main the last conversion was the github one, which contains the DSDP-DSF and DSDP-memory history grafted in).

All tags and branches verify correctly for main and edc. A couple of files which contain expansion tags ($Id) differ. On master these are:
+ codan/org.eclipse.cdt.codan.core.test/src/org/eclipse/cdt/codan/core/internal/checkers/UnusedSymbolInFileScopeCheckerTest.java
+ codan/org.eclipse.cdt.codan.ui/src/org/eclipse/cdt/codan/ui/LabelFieldEditor.java

Doug, Marc and others please verify -- I'm hoping this is done :)

Everything cloned quickly for me and things are building as I type this. It'll now be much easier to follow CDT development! Thanks for all the hard work, James, Doug, et al!

Finished verifying old. A bunch of deltas in $Id and $Name - nothing semantic.

Huge thanks to James for all his work on making this happen. I can take my eyes off my workspace. Looks sweet. :) Marking closed.
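Since the only verification differences reported were in files carrying CVS keyword strings, it may help to list such files up front when comparing a CVS checkout against a git checkout. The following is a rough sketch under that assumption. The function name files_with_cvs_keywords is hypothetical, and the pattern only covers the two keywords mentioned in this bug ($Id and $Name), in either collapsed or expanded form.

```shell
# Hypothetical helper: list files under a directory (skipping any .git
# directory) that contain a CVS keyword string such as $Id$ or
# $Name: some_tag $.
files_with_cvs_keywords() {
    find "${1:-.}" -type d -name .git -prune -o -type f \
        -exec grep -lE '\$(Id|Name)(:[^$]*)?\$' {} +
}
```

Files reported by this function are the ones where a byte-for-byte comparison between the two checkouts is expected to show keyword-only deltas.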