25 February 2008

Git: Start As a Superior SVN, then Leverage Even More

Git has been getting a fair bit of attention lately. I am relatively new to Git, but am definitely a convert and a big fan after only a short time using it. I'm at the point where I really don't want to use anything else. I have existing projects using SVN, and also have extensive experience with Perforce, both of which are centralized version control systems.

So, why Git, why as a superior SVN, and so on? If you are using Subversion, or for that matter many of the other choices, Git is worth a serious look, if only as a superior solution for your existing centralized version control. You can ignore the distributed version control aspects to start out.

I am a strong proponent of using developer "sandboxes." My definition of this stems from our use of version control at Adobe. Put simply, a sandbox is really a developer's private branch. Those familiar with Git, Mercurial, or other distributed SCMs will immediately see the parallel. With team development, each team member works in a sandbox, and when they have completed some amount of work that they deem suitable for the main line, or that meets their team's checkin policies, etc., they merge their branch into the mainline (aka trunk). Doing this in SVN is fairly painful (svnmerge.py helps, but it's still weak; SVN 1.5's merge abilities may help, but it's still not even up to what SVK does). Perforce has great support for this, but it's not all that fast, and setup isn't quite as easy as Git. Also, Perforce has a locking model (i.e. to edit files, you must check them out first), which annoys me to no end after having also used SVN, etc.

A sandbox is like your own private repository, and while I don't recommend ever checking in code that doesn't compile, etc., you can if you want. You gain the security of your code at least being backed up in a second location, you can checkpoint it as much as you want, and you get to leverage version control, all without hosing your teammates. On larger projects at Adobe, like Photoshop, we even took this a level further, and had a sandbox for the sub-team, so you would merge your sandbox to that, your sub-team's QA would test that, and then that got pushed to main, etc.

With Git, however, this "sandbox" model comes for free, thanks to the distributed/decentralized model. But do not fear: you can, and usually do, still have a central repository that is the official mainline/trunk of code! The mainline is set up as a repository, and each developer begins work by "cloning" that mainline, which creates a FULL repository on their machine (as in, not just the latest version of the code, but all history, etc.). Now, said developer can simply do their work, committing changes at will, taking full advantage of the version control system. Then, when they are ready to share their changes with the mainline/rest of the team, they simply do a push and all their commits land in the main repository (if teammates have pushed in the meantime, Git asks you to pull and merge their work first). Very much like working on a branch/in a sandbox model, but the beauty is that you don't have to set up a branch, you don't have to manage your branch (more painful in SVN, fairly easy in P4), and it's all VERY fast (crazy fast compared to both SVN and P4 for all these operations).
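To make the clone/commit/push cycle concrete, here is a minimal sketch you can run in a scratch directory. It is only an illustration under a few assumptions: the "central" mainline is faked with a local bare repository, and the names `mainline.git`, `alice`, and `notes.txt` are all made up for the demo.

```shell
#!/bin/sh
set -e

# Hypothetical scratch area; every path and name below is illustrative.
demo=$(mktemp -d)
cd "$demo"

# The "official" mainline: a bare repository standing in for a central server.
git init -q --bare mainline.git

# A developer clones it and gets a full repository, not just a working copy.
git clone -q mainline.git alice
cd alice
git config user.name "Alice Example"
git config user.email "alice@example.com"

# Commit locally as often as you like; nothing reaches the mainline yet.
echo "first draft" > notes.txt
git add notes.txt
git commit -q -m "Add notes"
echo "second draft" >> notes.txt
git commit -q -am "Expand notes"

# When ready, publish the current branch's commits to the mainline.
git push -q origin HEAD
```

Until that last line runs, both commits live only in the clone, which is exactly the private sandbox described above.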

What's also cool is that you can create branches off your own repository to do experiments or sub-projects, or to isolate the changes for, say, a bug fix, or whatever. Creating branches is so dang easy in Git that there is no reason not to do it for even the smallest thing.
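Here is a rough sketch of how cheap a throwaway branch is. The repository, the file `app.txt`, and the branch name `bugfix-123` are hypothetical, and the trunk's name is looked up rather than hard-coded, since the default branch name depends on your Git configuration.

```shell
#!/bin/sh
set -e

# Hypothetical scratch repository for the demo.
demo=$(mktemp -d)
cd "$demo"
git init -q project
cd project
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "stable" > app.txt
git add app.txt
git commit -q -m "Initial commit"

# Remember whatever the trunk is called here (master, main, ...).
trunk=$(git rev-parse --abbrev-ref HEAD)

# Create a branch for the fix and switch to it in one step.
git checkout -q -b bugfix-123
echo "fixed" >> app.txt
git commit -q -am "Fix the bug"

# Back on the trunk, merge the fix and drop the branch.
git checkout -q "$trunk"
git merge -q bugfix-123
git branch -d bugfix-123
```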

Thus, it's not that you can't do any of this without Git, but Git simply makes it far easier and far faster to do this, lowering the barriers to great use of source control, and making management of your code that much better.

And, leveraging this further, you can use Git to collapse a bunch of checkins down into one. So, in this sandbox model, say you were doing a bunch of really small incremental commits, you could "squash" some or all of those prior to pushing your code into main. Here's one blog entry on this kind of thing.
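As one possible way to do the squashing, the sketch below collapses three small commits into one with `git reset --soft` followed by a fresh commit. This is a non-interactive stand-in chosen so the example can run unattended; `git rebase -i` is the interactive route to the same result. The file name and commit messages are made up.

```shell
#!/bin/sh
set -e

# Hypothetical scratch repository for the demo.
demo=$(mktemp -d)
cd "$demo"
git init -q project
cd project
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "base" > feature.txt
git add feature.txt
git commit -q -m "Base"
base=$(git rev-parse HEAD)

# Three really small incremental commits in the sandbox.
for step in 1 2 3; do
  echo "step $step" >> feature.txt
  git commit -q -am "WIP step $step"
done

# Move the branch back to the base commit but keep every change staged,
# then record the lot as a single commit.
git reset -q --soft "$base"
git commit -q -m "Add feature (squashed from three WIP commits)"
```

Only do this to commits you have not pushed yet, since it rewrites history that others may already have.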

Now, Git does offer one feature that I find really cool, that is not in SVN or P4 (nor any other system that I'm aware of, but of course there are many I haven't used either). This is the "stash" (see git-stash).

You can probably guess what it does. The stash allows you to take some work and stash it away (without checking it in!) while you work on something else in the meantime. Maybe you are trundling along on a new feature, and then something quick comes up that you need to make your top priority. Just stash your existing work away, do that new work, then apply the stash back when you are ready to work on that code again. The stash is like a temporary holding spot, allowing you to keep track of work without having to check it in. Sure, you could simply whip up a branch and check it in to that, and that certainly works, but the stash is great when appropriate.
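The interruption scenario above might look something like this. It is a sketch with invented file names (`feature.txt`, `urgent.txt`), and the urgent fix deliberately touches a different file so the stash applies back cleanly.

```shell
#!/bin/sh
set -e

# Hypothetical scratch repository for the demo.
demo=$(mktemp -d)
cd "$demo"
git init -q project
cd project
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "v1" > feature.txt
echo "v1" > urgent.txt
git add .
git commit -q -m "Initial commit"

# Half-finished feature work that is not ready to be committed.
echo "half-done idea" >> feature.txt

# Stash it away; the working tree goes back to the last commit.
git stash

# Handle the top-priority work and commit it.
echo "hotfix" >> urgent.txt
git commit -q -am "Urgent fix"

# Apply the stash back and carry on where you left off.
git stash pop
```

After the `pop`, the feature edit is back in the working tree, still uncommitted.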

Some of this might seem small, but as we developers know, some of these small things can make a huge impact on your efficiency and make your day that much nicer. As I said, I'm totally sold on Git, and have been converting my SVN projects to Git. I've been using GitHub as my "central" repository, or rather, the way I look at it, it's my offsite copy, or backup. But setting up your own on a server is relatively simple as well, and you can use gitosis to manage access control and so on. Garry Dolley has a great writeup of the entire process (which is really rather short).

It might seem like a pain to change source/version control systems, but Git has tools to import an SVN repository, including all history, etc. I've used this on a relatively simple SVN repo and it worked fine - I haven't tried it on one with a slew of branches and tags, etc. Regardless, I would highly recommend checking out Git.


Don said...

I definitely agree that a move from an SCM like Subversion to Git really is a lot less painful than one might expect. You can initially use it in almost an identical manner to your SVN workflow, and start to take advantage of its other features at your own pace.

I didn't want to just hop on the Git bandwagon, but I got hooked once I tried it.

Chris said...

Don, I agree. At first I was thinking this was one of those cases where someone invented a new tool where there wasn't a need for one, or maybe where people were just getting excited for a new tool where they didn't need one. But, as I looked at Git closer, and after hearing a few folks comment about it for specific solutions, I dove deeper. Once I saw how easy it was to branch, and how I could do this sandbox approach, etc., I was hooked. After that, now that I've been using it, the speed is definitely another big win. Lots of other great aspects too as covered here and many other places.