git index explained like a magic trick

Square uses git for version control. Prior to joining I had used Perforce extensively and lightly used Subversion for home projects. Like a lot of people I struggled with git’s steep learning curve: confusing terminology, and commands that share their names with Subversion commands but do something entirely different (hello git checkout!). It was a rough couple of weeks but now I’ve fallen in love with it, quirks and all, and it’s become my go-to version control system.

This is the first of a series of short posts on tricks I’ve learned along my git journey. To start I’m going to tackle the idiom that I had the most trouble with — the ‘staging area’ or ‘index’ (Warning: I use the terms interchangeably).

The Pledge

Most git tutorials start with something simple – add a file to git and check it in:

$ git add my_file.txt
$ git commit -m 'My first commit!'

Looks reasonable — I added the file to version control. So far so good.

The Turn

Then tutorials move on to committing a change to the document:

$ echo 'Lets confuse the reader!' >> my_file.txt
$ git add .
$ git commit -m "I'm doing a bad job of explaining myself."

Hang on… why am I adding the file again?

Confusion has stepped in and I’m beginning to get worried. Most tutorial authors understand my confusion (they were once beginners too) and will offer by way of explanation: “git requires you to add changes to the staging area before you commit.”

Why do I need to add the change? You already know what the change is, look!

$ echo 'Why am I repeating myself?' >> my_file.txt
$ git diff
diff --git a/my_file.txt b/my_file.txt
index 68b8348..f828edb 100644
--- a/my_file.txt
+++ b/my_file.txt
@@ -1 +1,2 @@
 Lets confuse the reader!
+Why am I repeating myself?
$

There is the change, right there! I should just be able to commit it:

$ git commit -m 'A better workflow'
# On branch master
# Changes not staged for commit:
#   (use "git add <file>..." to update what will be committed)
#   (use "git checkout -- <file>..." to discard changes in working directory)
#
#	modified:   my_file.txt
#
no changes added to commit (use "git add" and/or "git commit -a")
$

Sorry Bub, but you are going to have to git add that change, and every change you will ever make. Most tutorials will show you git commit -a and move on to “… the next exciting chapter!”. And off you go, frustrated that git forces this index on you without any foreseeable benefit; it’s just something that you learn to work around.

The Prestige

Most software engineers know that in the course of making one change they tend to introduce other unrelated changes. Let’s assume that we’ve done some work on the file and after bug fixing and refactoring it looks like this:

$ git diff
diff --git a/my_file.txt b/my_file.txt
index aec6da9..6de97c5 100644
--- a/my_file.txt
+++ b/my_file.txt
@@ -1,4 +1,8 @@
 Lets confuse the reader!
+This is part of a bug fix
 Kirk
 Picard
+A bit more bug fixin'
 Janeway
+Refactor, refactor.
+Some more refactoring.
$

If you use a different VC system you are left with a handful of choices, none of which are ideal [1]. What I really need is a way to select which changes go into the commit. Oh look at this… git add has this --patch switch.

Ta da!

When you use git add --patch you put git into an interactive mode where it asks, for each diff hunk, whether you want to add it to the staging area. Why is this useful? Because, dear reader, it allows you to compose your changes prior to committing. This way you can keep your refactoring changes and put them in a separate commit without having to back them out first.

The first time I saw git add --patch in action I immediately understood the power of the staging area. Why don’t all version control systems have this? It’s been a part of my workflow ever since; in fact I use it so much that I made it into a bash alias:

$ alias gap
alias gap='git add --patch'

So now you know.

Postscript

git supports the --patch switch on many of its commands; read the docs to find out which. The others I use it with the most are checkout, stash and log.

[1] You can: 1) skip the refactor for now and come back later, 2) have this commit include a whole set of unrelated refactoring, 3) move the refactor changes away, finish the original work and then commit the refactor, or 4) create a new client and port the changes over. I’m going to be blunt, brace yourselves — these all suck.

New job!

February 2nd was my final day as a Google employee. That same day my wife and I left for a brief vacation in Costa Rica, and on my return I started as an engineer at Square. One month in and I’m loving it, very happy that I made the change.

Indoor Street View

This morning Google announced its Google Art Project (launch blog post). This is a really exciting project, utilizing Google’s Street View technology to bring art in front of a worldwide audience.

For a while I was working on a 20% project with the team to store and render the building floor plans. On the surface this is a well defined problem – get floor blueprints from building owners, digitize analog sources, convert into some canonical representation and upload them into a data store. Presto! Problem was the quality of the source data varied dramatically. If we were lucky we got AutoCAD files from the architects, but more often than not we were unlucky. A lot of building owners couldn’t find the blueprints, and one even sent us a camera phone picture of the fire escape plan mounted on the wall.

More teams got wind of what I was working on and wanted to generalize the work to fit into their projects, so they became stakeholders. Eventually it reached the point where my role had grown from being a 20% engineer solving technical problems, to becoming a full-time PM managing several different teams. I had to pull out because it was disrupting my work on my main team (Google Maps). The Art team ultimately came up with a very clever way of generating floor plans that didn’t depend on data from building owners.

Since then I’ve kept tabs on the project and watched it progress into the product that was launched today. I’m glad to have had at least some involvement with the project, and the opportunity to work with Jonathan Siegel, lead engineer, and Daniel Ratner, lead mechanical engineer. All the best with future work on the product!

A game in HTML5

[It's been a while since I updated my blog, and even though I have low readership I should blog more to improve my writing skills. Write that one down as a New Year's resolution.]

In October of last year Google held an HTML5 Game Jam to spur game development in HTML5. I decided to attend the San Francisco office event and paired up with fellow Googler Mark Ivey to hack together a game. Our game isn’t a stunning opus or an exciting new mechanic, rather it’s the result of two engineers who started to code with no idea of what we were going to build.

The jam ran for two days, but I was only able to attend the first day. By the end of it we had something that began to look like a game – a scrolling landscape, independently moving player characters and a damage system. Mark came back on the second day to tie up some of the loose ends, adding a game over screen, some visual polish, gameplay instructions and bug fixes.

Oh, and we released the source code. Feel free to hack on it, feedback welcome.

Raytracer source code

I’ve put the source to my JavaScript raytracer up on GitHub. The version up there is more recent than the published demo, but most of the differences are not visible.

I grew up using more traditional SCM software (Microsoft’s Visual SourceSafe [1], CVS and Perforce), so I find myself on unfamiliar ground with git. This is my first outing with it, and I’m curious to see whether its model is useful for small projects.

[1] We preferred to call it “SourceUnSafe” as it did a pretty poor job at managing source code (lack of atomic commits, corruption, database bloat). Apologies for the programmer humor. To be fair we wouldn’t have shipped the project without SCM but SourceSafe came close to sinking us on more than one occasion. After ship we switched to Perforce and I’ve never looked back.

Correction: JavaScript logical XOR and parity

In an earlier post (which I’ve since deleted to prevent incorrect knowledge from spreading) I claimed that JavaScript did not have a logical XOR operator and explained how to compute the parity of numbers. Turns out I was wrong on the XOR operator, and there is a simpler way to compute parity.

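Nathan Lucash pointed out that JavaScript does indeed have an XOR operator, ^ (just like C). Strictly speaking it is a bitwise XOR, but on operands that are 0 or 1 it gives the same answers as a logical XOR, and it works just fine: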

> for (var i = 0; i < 2; i++) { for (var j = 0; j < 2 ; j++) { window.console.log(i ^ j); } }
0
1
1
0
>
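The loop above only feeds 0 and 1 to ^. As a minimal sketch of what happens with other values: ^ works bit by bit, so two truthy numbers do not collapse to a single true/false answer. The helper name logicalXor below is hypothetical (it is not from the original post); it simply coerces both operands to booleans before comparing them.

```javascript
// Hypothetical helper, not part of the original post:
// logical XOR for arbitrary values, by comparing their truthiness.
function logicalXor(a, b) {
  return Boolean(a) !== Boolean(b);
}

console.log(5 ^ 3);             // 6: bitwise XOR of 101 and 011 is 110
console.log(logicalXor(5, 3));  // false: both operands are truthy
console.log(logicalXor(5, 0));  // true: exactly one operand is truthy
console.log(true ^ false);      // 1: booleans are converted to numbers first
```

For values that are already 0/1 or true/false the two approaches agree; they only diverge when the operands can be other numbers, strings or objects.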

On the issue of parity it turns out I was thinking too much like a statically typed C programmer: because JavaScript’s only numerical type, Number, is implemented with IEEE floating-point, I discounted the approach of checking the bottom bit of the number. To be honest I think that was a reasonable assumption, but I should have been a little more naïve:

> 1 & 1
1
> 2 & 1
0
> -3.14156 & 1
1

The code above uses the bitwise AND operator (&) to isolate the lowest bit (numerical value of 1). I’m not sure what is happening under the covers, but it looks like JavaScript casts Numbers to an internal Integer type and then performs the bit operation. When I have more time I’ll look into the V8 source.
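The conversion can be observed from script without reading any engine source. The ECMAScript specification has the bitwise operators convert Number operands to 32-bit signed integers first (truncating toward zero and wrapping modulo 2^32). The sketch below makes that visible with | 0; the helper name isOdd is hypothetical, and nothing here says how V8 implements the conversion internally.

```javascript
// Hypothetical helper, not from the post: parity via the lowest bit.
function isOdd(x) {
  return (x & 1) === 1;
}

console.log(-3.14156 | 0);       // -3: the fractional part is truncated toward zero
console.log((2 ** 32 + 1) | 0);  // 1: values wrap modulo 2^32
console.log(isOdd(7));           // true
console.log(isOdd(-4));          // false
```

This is why -3.14156 & 1 evaluates to 1 above: the operand becomes -3 before the AND, and the lowest bit of -3 is set.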

[UPDATE: 10/14] I found two places in the V8 code where it casts Numbers to Integers prior to doing a logical XOR. One is in C++ code, and the other is in some JavaScript code. It’s not clear to me yet why there are separate implementations, or what the role of the JS code is in V8. I suspect these operations are very slow though, as the last time I benchmarked double <-> int conversions (on the Pentium 2) they were hellishly slow. It would make an interesting benchmark, for a future post perhaps.

A raytracer in JavaScript

JavaScript Shiny!

I’ve spent a little time on and off over the last few weeks implementing a simple raytracer in JavaScript. The implementation of the floor inspired the post about JavaScript’s lack of a logical xor.

I wrote the raytracer to explore HTML5’s canvas capabilities and the computational performance of browsers’ JavaScript engines (such as Chrome’s V8 engine). I’m also experimenting with various design ideas and would like to try out HTML5’s Web Workers.

I plan to release the source code when it’s in better shape, and I’ll post about it when I do. Until then, you can poke through the JS files downloaded by the page.

JavaScript logical XOR and parity

[UPDATE: 10/13] The information in this post is incorrect and misleading. I have written a new post with corrections. Apologies.

I recently had the need for a logical XOR (eXclusive-OR) function in JavaScript. Unfortunately JavaScript doesn’t provide one, but from the operator’s truth table it’s pretty easy to implement:

function xor(a, b) {
  return (a && !b) || (!a && b);
}

My use case required me to XOR the parity of two integers. Parity is a fancy way of referring to whether an integer is even or odd. In languages that support Integers and bitwise operators this is trivial because in the binary representation of an integer, the parity is stored in the least significant bit. Here’s an example in C:

bool res = (a & 1) ^ (b & 1);  // ^ is C's XOR operator.

JavaScript however lacks a specific Integer type and (I believed at the time) bitwise operators, so I had to implement the two functions by hand. The approach I chose was to use the fact that dividing an odd number by two yields a fractional part, specifically 0.5. JavaScript also lacks a built-in method to return the fractional part of a Number, so I had to implement that as well.

/**
  * Returns the fractional part of a number.
  */
function frac(x) {
  return x - Math.floor(x);
}

/**
  * Returns true if the number is even, false otherwise.
  * Assumes that x is a floored value, i.e. that it has no fractional
  * part.
  */
function even(x) {
  return frac(x/2) < 1e-2;  // use an epsilon
}

One final comment about the even() function: when comparing floating-point numbers you should avoid exact comparisons. This is why I do not compare the result of the frac() function to 0. Instead I use a fuzzy comparison, and check that the returned value is less than 0.01. This epsilon is fine because frac() will either return 0 or 0.5 (or approximations there-of).
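As a quick check of that last claim, the sketch below prints frac(x/2) for a few whole numbers. Halving a whole number is exact in IEEE floating point, so the results should come out as exactly 0 or 0.5, which makes the epsilon a safety margin here rather than a necessity. The function evenByRemainder is a hypothetical alternative (not from the post) that uses the remainder operator directly on Numbers.

```javascript
// frac() as defined in the post.
function frac(x) {
  return x - Math.floor(x);
}

// Hypothetical alternative, not from the post: parity via the remainder operator.
// Comparing against 0 also handles negative inputs, where x % 2 can be -1.
function evenByRemainder(x) {
  return x % 2 === 0;
}

for (const x of [0, 1, 2, 7, 8, -3]) {
  console.log(x, frac(x / 2), evenByRemainder(x));
}
```

Both approaches assume x has no fractional part, as even() does.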

Ahead of the Curve

I recently came across a video for a new Ubisoft game called From Dust. It was co-announced at the recent GDC Europe by the lead designer and engineer who gave presentations on it. I’ve embedded the tech demo video below:

What struck me as I watched this video was how similar it was to a project that we were working on at Bullfrog in 1999. The project was called Genesis and the similarities between the two games are uncanny: real-time landscape editing, a water simulation, a god’s eye view of the landscape and people walking around. From Dust looks a lot prettier than our prototype did though!

The idea of Genesis was to explore a new direction for Populous, one based around indirect control. Most games use direct control: you tell NPCs (non-player characters) explicitly where to go, and what to do when they get there. With an indirect control mechanism you cannot micromanage individual units; you can only influence where they go. We hoped to achieve this influence through a variety of means: the shape and nature of the landscape, natural resources (water, food, etc.), other tribes, and so on. This seemed to fit the player’s role as a deity very well, and has the advantage that you can manage many more units than is possible with direct control.

Unfortunately after about 5 months of work, the project was cancelled and the team disbanded. While I was skeptical that the indirect control system could be made to work, I was a little disappointed as we were exploring interesting game mechanics and technologies. A total of 6 people worked on the prototype: Gary Stead (lead engineer, game simulation), David Bryson (rendering and water simulation), myself (rendering, terrain LOD and data pipelines), Alex Godsill (animator, modeler and concept artist), Ernest Adams (designer) and a level designer called Dan, whose last name escapes me right now. Apologies Dan!

The water simulator was a direct implementation of M. Kass and G. Miller, “Rapid, Stable Fluid Dynamics for Computer Graphics”. The terrain LOD system was influenced by P. Lindstrom et al., “Real-Time, Continuous Level of Detail Rendering of Height Fields”.

I’m going to be watching From Dust with great interest, hoping that they can prove the tech and gameplay mechanics.

My demos could beat up your demos

In the mid-90s I used to be fairly active in the PC demoscene. It was a heady mix of images, sound and impressive technical chops. There was something fun about working within the tight constraints of then-current PC hardware: an 8-bit image display measuring 320×200 pixels (or 320×240 if you cared about your pixels being square), a 386 processor, and a Gravis UltraSound sound card.

In the early days demos were really a way of showing off people’s graphical programming skills, but with the introduction of the GPU some of those skills have fallen away. The effect of the GPU on the demo scene cannot be overstated: it brought perspective-correct texture mapping, bilinear filtering, “true-color” blending and unpacked RGB image formats. But this performance was achieved by severely limiting the flexibility of the rendering pipeline. For games (the primary driver of GPU sales) this was fine, but demos had thrived on inventive ways of rendering, which defined the style of a demo or group. So demos became more like 3D object slideshows, and less like this:

you am i you am the robot / Orange

To make a bad and overly lofty analogy, it’s like giving a painter a larger color palette but then telling them they can only paint in straight lines. In the last few years GPUs have opened up their rendering pipelines, allowing for more variety, but the scene hasn’t regained the sense of style and presentation that was present in the mid-90s. Which brings me to the original motivation for this post: I just watched the top two demos at this year’s Assembly contest, and found myself thoroughly bored. Judge for yourself:

1st place – “Happiness is around the bend” / Andromeda Software Development

2nd place – “Ceasefire (all fall down…)” / CNCD vs Fairlight

The winning demo is impressive for its production quality and for having a consistent theme, but it’s overlong and too grandiose for me. It sort of looks like a long Maya demo reel. The CNCD demo opens with a nice effect and has an impressive point/particle rendering technique that thankfully lends a distinctive look, but for me it fails to exploit the technique fully. With the exception of the building explosion sequence, it’s just scene after scene rendered as particles. Don’t get me wrong, I think the tech is great, and I’m a fan of anyone who can write a fluid solver, but the demo lacks… personality.

Yawn. Maybe I’m just a curmudgeon who constantly compares everything to the ‘good old days’? Still, every time I watch a demo it does make me want to do something again :)