PhpRiot
News Archive
PhpRiot Newsletter
Your Email Address:

More information

From Subversion to GIT (and beyond!)

Note: This article was originally published at Planet PHP on 17 April 5980.
Planet PHP

Here's a more or less simple way to migrate from Subversion to GIT(hub), this includes mapping commits and tags and what not!

Authors

If multiple people congtributed to your project, this is probably the toughest part. If you're not migration from let's say Google Code but PHP's Subversion repository, then it's really pretty simple indeed: the username is the email address.

I found a nifty bash script to get it done (and adjusted it a little bit):

#!/usr/bin/env bash authors=$(svn log -q | grep -e '^r' | awk 'BEGIN { FS = "|" } ; { print $2 }' | sort | uniq) for author in ${authors}; do echo "${author} = ${author} "; done

Since I migrated my project already, I didn't have the Subversion tree handy. That's why I used another package I maintain to demo this.

This is how you run it (assumes you have chmod +x'd it before):

# ./authors.sh cvs2svn = cvs2svn cweiske = cweiske danielc = danielc gwynne = gwynne kguest = kguest pmjones = pmjones rasmus = rasmus till = till

If you redirect the output to authors.txt, you're done.

Note: In case people don't have the email address you used on their Github account, they can always add it later on. Github allows you to use multiple email addresses, which is pretty handy for stuff like open source and work-related repositories.

git clone

This part took me a long time to figure out - especially because of the semi-weird setup in Google Code. The wiki is in Subversion as well, so the repository root is not the root where the software lives. This is probably a non-issue if you want to migrate the wiki as well, but I don't see why you would cluter master with it. Instead, I'd suggest to migrate the wiki content into a seperate branch.

Without further ado, this works:

# git svn clone --authors-file=./authors.txt --no-metadata --prefix=svn/ \ --tags=Services_ProjectHoneyPot/tags \ --trunk=Services_ProjectHoneyPot/trunk \ --branches=Services_ProjectHoneyPot/branches \ http://services-projecthoneypot.googlecode.com/svn/ \ Services_ProjectHoneyPot/

The final steps are to add a new remote and push master to it. Done!

You can see the result on Github.

A shortcut with Google Code

I facepalm'd myself when I saw that I could have converted my project on Google Code from Subversion to GIT.

This seems a lot easier since it would allow me to just push the branch to Github without cloning etc.. I'm not sure how it would spin off the wiki-content and how author information is preserved, but I suggest you try it out in case you want to migrate your project off of Google Code.

Doing it the other way is not time wasted since I had to figure out the steps regardless.

Summary

There seem to be literally a million ways to migrate (from Subversion) to GIT. I hope this example got you one step closer to your objective.

The biggest problem migrating is, that often people in Subversion-land screw up tags by committing to them (I'm guilty as well). I try to avoid that in open source, but as far as work is concerned, we sometimes need to hotfix a tag and just redeploy instead of running through the process of recreating a new tag (which is sometimes super tedious when a large Subversion repository is involved).

When I migrated a repository the other day, I noticed that these tags became branches in GIT. The good news is, I won't be able to do this anymore with GIT (Yay!), which is good because it forces me to create a more robust and streamlined process to get code into production. But how do I fix this problem during the migration?

Fix up your tags

If you happen to run into this problem, my suggestion is to migrate trunk only and then re-create the tags in GIT.

GIT allows me to create a tag based on a branch or based on a commit. Both options are simple and much better than installing a couple Python- and/or Ruby-scripts to fix your tree, which either end up not working or require a PHD in Math to understand.

To create a tag from a branch, I check out the branch and tag it. This may work best during a migration and of course it depends on how many tags need to be (re-)created and if you had tags at all. Creating a tag based on a commit comes in handy, when you forgot to create a tag at all: for example, you fixed a bug for a new release and then ended up refactoring a whole lot more.

In order to get the history of your GIT repository, try git log:

"/

Truncated by Planet PHP, read more at the original (another 2572 bytes)