Wednesday 16 December 2009

We don't need SCM on the desktop.

The current crop of distributed SCMs has brought new power to software development. Using them for the first time is a liberating experience - now I can keep track of what I've been doing, not just what the team has been doing. No longer am I tempted to check something half-complete into the server, just so that I can have a revision-controlled version of it off-site in case I rm -rf the wrong project.

Distributed SCM has brought us great freedom, but it's not enough.

A public repository deals with revision control. Essentially it's a bunch of patches, each of which hopefully represents one new feature or bugfix. They form a delicate yet inexorable tide of progress, a steadfast adventurer boldly seeking El Dorado, that mythical place where functionality is balanced perfectly against usability, and performance and robustness combine.

In this bunch-of-patches model, it is important that everything be append-only. Many people are reading these patches, and future patches rely on the state not changing. Therefore we must do nothing to upset the patches, or they will send their dreaded minion, Merge Conflict, against us. For this reason both Git and Mercurial are essentially append-only packages - patches go in, and then never come out again.

Sure, you can edit the history, but there's always a feeling that it's a bit dirty... commands exist, but they're usually extensions, and have big caveats attached to them. The feeling is that this is something of a last resort, so it's ok if the tools for it aren't straightforward to use.

Rubbish I say! A personal repository is a vastly different beast to a public one! The goal of a local repository is to act as a cauldron, in which noxious reagents bubble away together, mixing away until they finally form a potent new elixir. At this point they are rapidly bottled and placed on the shelf.

I'm the kind of coder who is always in the middle of some change when I notice a small bug in something unrelated. It might even just be a misleading comment. At this point a common reaction is just to fix it and check it in with whatever I'm doing. However, as a bit of a perfectionist, I find this unsatisfactory.

With, say, Mercurial, I can now address this issue in a nicer way. I can check in what I'm doing, hg up back to the previous checkin, make the unrelated fix and check that in, then go back to what I was doing. Later on I can recombine the two halves of the change using a combination of transplant, revert, commit and strip.

But shouldn't this be trivial? I should be able to recombine these changesets at the drop of a hat. I should easily be able to use a common merge tool to select the changes I want in each changeset. I should be able to drag changes around in some visual undo-tree editor to recombine them, remove them and what have you.

In general the idea of changesets or patches isn't really relevant on the desktop. Whatever changes I'm making only become a patch or changeset once everything's done. Up to that point, it's easier to think of things as a stream of edits, none of which can be considered a logical patch. To me, adding a 'commit' in this mode should be as easy as pressing Ctrl-S. I shouldn't have to provide a name yet; I'm not even sure what effect the final change will have until it's all done. Once I've done all the parts, tested it on various platforms, and broken all the work up into discrete changes, then I will give them all names and submit them.
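To make the "stream of edits" idea a little more concrete, here is a rough Python sketch of what such an editor-side history might look like. Everything in it is hypothetical: the EditStream class, its save and publish methods, and the file names are invented for illustration and don't correspond to any real tool. It also cheats by grouping changes per file, where a real editor would presumably let you pick individual hunks.

```python
from dataclasses import dataclass, field


@dataclass
class EditStream:
    """Hypothetical editor history: anonymous snapshots, named only when published."""
    snapshots: list = field(default_factory=list)

    def save(self, files):
        """Record the current file contents; no name or message is needed."""
        self.snapshots.append(dict(files))

    def changed_paths(self):
        """Paths whose content differs between the first and the latest snapshot."""
        base, final = self.snapshots[0], self.snapshots[-1]
        return {p for p in base.keys() | final.keys() if base.get(p) != final.get(p)}

    def publish(self, groups):
        """Turn the stream into named changes.

        `groups` maps a change name to the set of paths it should contain.
        Every changed path must be claimed by exactly one group.
        """
        claimed = [p for paths in groups.values() for p in paths]
        if sorted(claimed) != sorted(self.changed_paths()):
            raise ValueError("groups must cover each changed path exactly once")
        final = self.snapshots[-1]
        return {
            name: {p: final.get(p) for p in sorted(paths)}
            for name, paths in groups.items()
        }


# Interleaved work: a half-done feature, a drive-by comment fix, then the rest.
stream = EditStream()
stream.save({"parser.py": "v1", "util.py": "# sorts descending"})  # starting point
stream.save({"parser.py": "v2 (half done)", "util.py": "# sorts descending"})
stream.save({"parser.py": "v2 (half done)", "util.py": "# sorts ascending"})
stream.save({"parser.py": "v2 (done)", "util.py": "# sorts ascending"})

# Only now, with everything finished, do the changes get names.
patches = stream.publish({
    "Fix misleading comment in util": {"util.py"},
    "Add new parser feature": {"parser.py"},
})
```

The point of the sketch is the ordering: saving is free and nameless, and the naming and splitting happen once at the end, when I finally know what the change turned out to be.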

So, as I said, we don't need SCM on the desktop - we need a source code EDITOR. A proper one, that understands directories and history - perhaps it uses an anonymous git repository in the background to manage the changes. To my mind all this is the job of an editor, and I should be able to use it without even realizing that I'm using a repository. I can then submit patches to whatever SCM the rest of the team is using, and everyone will congratulate me on my nice clean patches that just do what they say.