git index explained like a magic trick
Square uses git for version control. Prior to joining I had used Perforce extensively and lightly used Subversion for home projects. Like a lot of people I struggled with git’s steep learning curve: confusing terminology, and commands that share names with Subversion’s but do something entirely different (hello git checkout!). It was a rough couple of weeks, but now I’ve fallen in love with it, quirks and all, and it’s become my go-to version control system.
This is the first in a series of short posts on tricks I’ve learned along my git journey. To start, I’m going to tackle the concept I had the most trouble with: the ‘staging area’ or ‘index’ (warning: I use the terms interchangeably).
The Pledge
Most git tutorials start with something simple – add a file to git and check it in:
$ git add my_file.txt
$ git commit -m 'My first commit!'
Looks reasonable — I added the file to version control. So far so good.
The Turn
Then tutorials move on to committing a change to the document:
$ echo 'Lets confuse the reader!' >> my_file.txt
$ git add .
$ git commit -m "I'm doing a bad job of explaining myself."
Hang on… why am I adding the file again?
Confusion has stepped in and I’m beginning to get worried. Most tutorial authors understand my confusion (they were once beginners too) and will offer, by way of explanation, “git requires you to add changes to the staging area before you commit.”
Why do I need to add the change? Git, you already know what the change is. Look!
$ echo 'Why am I repeating myself?' >> my_file.txt
$ git diff
diff --git a/my_file.txt b/my_file.txt
index 68b8348..f828edb 100644
--- a/my_file.txt
+++ b/my_file.txt
@@ -1 +1,2 @@
 Lets confuse the reader!
+Why am I repeating myself?
$
There is the change, right there! I should just be able to commit it:
$ git commit -m 'A better workflow'
# On branch master
# Changes not staged for commit:
#   (use "git add <file>..." to update what will be committed)
#   (use "git checkout -- <file>..." to discard changes in working directory)
#
#       modified:   my_file.txt
#
no changes added to commit (use "git add" and/or "git commit -a")
$
Sorry, bub, but you are going to have to git add that change, and every change you will ever make. Most tutorials will show you git commit -a and move on to “… the next exciting chapter!” And off you go, frustrated that git forces this index on you without any apparent benefit; it’s just something you learn to work around.
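For what it’s worth, git will show you where a change currently sits if you ask. The sketch below is only an illustration: the throwaway repository from mktemp and the placeholder user name and email are my assumptions, not part of any tutorial. It compares git diff (working tree against the index) with git diff --cached (index against the last commit), before and after git add.

```shell
#!/bin/sh
# Sketch: where does a change live before and after `git add`?
set -eu

repo=$(mktemp -d)                       # throwaway repository (assumption)
cd "$repo"
git init -q
git config user.name 'Example'          # placeholder identity for the demo
git config user.email 'example@example.com'

echo 'Lets confuse the reader!' > my_file.txt
git add my_file.txt
git commit -q -m 'My first commit!'

echo 'Why am I repeating myself?' >> my_file.txt

# Before `git add`: the change exists only in the working tree.
unstaged_before=$(git diff --name-only)
staged_before=$(git diff --cached --name-only)

git add my_file.txt

# After `git add`: the change has moved into the index (staging area).
unstaged_after=$(git diff --name-only)
staged_after=$(git diff --cached --name-only)

echo "before add: unstaged='$unstaged_before' staged='$staged_before'"
echo "after add:  unstaged='$unstaged_after' staged='$staged_after'"
```

git commit only ever records what the second command reports, which is why the bare commit above had nothing to do.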
The Prestige
Most software engineers know that in the course of making one change they tend to introduce other unrelated changes. Let’s assume that we’ve done some work on the file and after bug fixing and refactoring it looks like this:
$ git diff
diff --git a/my_file.txt b/my_file.txt
index aec6da9..6de97c5 100644
--- a/my_file.txt
+++ b/my_file.txt
@@ -1,4 +1,8 @@
 Lets confuse the reader!
+This is part of a bug fix
 Kirk
 Picard
+A bit more bug fixin'
 Janeway
+Refactor, refactor.
+Some more refactoring.
$
If you use a different VC system you are left with a handful of choices, none of which is ideal [1]. What I really need is a way to select which changes go into the commit. Oh, look at this… git add has a --patch switch.
Ta da!
When you use git add --patch you put git into an interactive mode where, for each diff hunk, it asks whether you want to add it to the staging area. Why is this useful? Because, dear reader, it allows you to compose your changes prior to commit. This way you can commit the bug fix on its own and put the refactoring changes in a separate commit, without having to back them out first.
The first time I saw git add --patch in action I immediately understood the power of the staging area. Why don’t all version control systems have this? It’s been part of my workflow ever since. In fact, I use it so much that I made it into a bash alias:
$ alias gap
alias gap='git add --patch'
So now you know.
Postscript
git supports the --patch switch on many of its commands; read the docs to find out which. Besides add, the commands I use it with most are checkout, stash and log.
[1] You can: 1) skip the refactor for now and come back later, 2) have this commit include a whole set of unrelated refactoring, 3) move the refactor changes away, finish the original work and then commit the refactor, or 4) create a new client and port the changes over. I’m going to be blunt, brace yourselves — these all suck.
