Sunday, October 2, 2011

Generalizing version control systems

Continuing on the topic of version control system criticism

When VCS does not fit
At GDCE 2011 I attended a talk by a guy from CCP Games ("Addressing Human Scalability Through Multi-User Editing Using Revision Databases" by John Rittenhouse) where he described how they addressed the problem of artists editing their levels / galaxies. At Crytek we have relatively small levels, so when an artist wants to modify one he just locks it for exclusive edit in Perforce. Their levels are huge and have no distinct partitioning. At CCP it is common for multiple artists to make changes to the same level simultaneously, so exclusive locks are unacceptable.

Their solution was to move away from files in Perforce and store levels in a MySQL database. To stop artists from messing up each other's work they implemented object-level locking: for example, if one artist moves a box on a level, the corresponding row in the DB is marked as locked. In the level editor, other artists see that this object is being worked on and cannot change it. They traded the file-level atomicity of Perforce for row-level atomicity in MySQL; well, I guess you've got the idea.
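To make the object-level locking idea concrete, here is a minimal sketch of how I imagine it. This is not CCP's implementation: a plain in-memory dictionary stands in for the MySQL table, and the names `ObjectLockTable`, `try_lock`, `holder` and `release` are made up for illustration.

```python
class ObjectLockTable:
    """In-memory stand-in for a DB table with one row per level object.

    Hypothetical sketch: a real system would keep this state in the
    database and take the lock inside a transaction.
    """

    def __init__(self):
        # object_id -> name of the artist currently holding the lock
        self._locks = {}

    def try_lock(self, object_id, artist):
        """Lock the object for `artist`; return False if someone else holds it."""
        # setdefault stores `artist` only when the object is not locked yet
        # and returns whoever holds the lock afterwards.
        current = self._locks.setdefault(object_id, artist)
        return current == artist

    def holder(self, object_id):
        """Return the artist holding the lock, or None if the object is free."""
        return self._locks.get(object_id)

    def release(self, object_id, artist):
        """Release the lock, but only if `artist` is the one holding it."""
        if self._locks.get(object_id) == artist:
            del self._locks[object_id]
            return True
        return False
```

The level editor would call `try_lock` when an artist starts dragging an object and use `holder` to show everyone else who is working on it. Two artists can edit the same level as long as they touch different objects, which is exactly what a file-level exclusive lock cannot give you.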

Anyway, the most interesting takeaway from the talk was how they struggled to fit this technology into their production pipeline. First of all, the described approach is similar to working on code without any revision control at all, so they added support for changelists, submits and reverts from the beginning. Then they had a real headache adding branching, and they only supported a simplified "promotional branching" model where there is only one flow of changes. But their production was moving towards "mainline branching" (see the slides), so their system could not keep up. And, as I understood, at the time of the talk this issue was not solved yet. Basically, by making this move they condemned themselves to implementing an analogue of Perforce on top of MySQL, which is a damn hard thing to do.

VCS hell
In the previous post I expressed my doubts about having separate version control systems for code and assets, because of how it affects the atomicity of changes and the consistency of changelists. Essentially, CCP added one more VCS for the level data. This is insane.

I'm not saying that CCP made the wrong move by doing this; in fact I can't think of any simple alternative. But having several VCSs is a dead end. The CCP guys are aware of this and, talking to a colleague of mine, mentioned that they had thought about actually putting all their code into MySQL to use it as a single VCS.

Solution (?)
So above is one example of a requirement that can't be reasonably fulfilled by any of the existing version control systems, but I bet it is not the only one of its kind. As an architect (well, I like to think of myself as one), when making a decision, especially a spontaneous one, I always question it from multiple sides: "Why is this best?", "How will it work with X?", "How can it be generalized?". And with contemporary VCSs the following question always comes to my mind: "Why do they have to operate on files?".

Choosing a file as the unit of atomicity was probably one of those spontaneous decisions made when the first VCS was born; it is just a legacy. I want to be able to customize the behaviour of the VCS and change the atomicity unit any way I need, while keeping changelists, submits, reverts and branching untouched. Leaving the file as the primary atomicity unit (I can live with that :), why not add handlers for specific file types that know which sub-units a file consists of, how to diff and merge them, etc.? This opens up lots of opportunities: for example, you could implement a handler for a video file that lets you edit each frame as a bitmap image, check out separate tracks from a music pack, or, at last, lock a single box on a level. You can go even crazier with hierarchical atomicity levels: the basic level is a directory (you merge it by choosing which files to keep), the next level is a file (you merge it the way you do now), and deeper levels are your custom handlers, like an area of a level, which in turn breaks down into objects.
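Here is a rough sketch of what such a handler could look like. Everything in it is an assumption made for illustration: the `UnitHandler` interface, the toy one-object-per-line level format and the `merge_units` helper do not come from any existing VCS.

```python
from abc import ABC, abstractmethod


class UnitHandler(ABC):
    """Hypothetical plug-in that tells the VCS how to break a file into sub-units."""

    @abstractmethod
    def split(self, content):
        """Return a dict mapping sub-unit id -> sub-unit content."""

    @abstractmethod
    def join(self, units):
        """Rebuild the file content from its sub-units."""


class LevelHandler(UnitHandler):
    """Toy level format (an assumption): one object per line, 'name: properties'."""

    def split(self, content):
        units = {}
        for line in content.splitlines():
            if not line.strip():
                continue  # ignore blank lines
            name, _, props = line.partition(":")
            units[name.strip()] = props.strip()
        return units

    def join(self, units):
        return "\n".join(f"{name}: {props}" for name, props in sorted(units.items()))


def merge_units(base, ours, theirs):
    """Three-way merge per sub-unit; returns (merged units, conflicting ids)."""
    merged, conflicts = {}, []
    for key in sorted(set(base) | set(ours) | set(theirs)):
        b, o, t = base.get(key), ours.get(key), theirs.get(key)
        if o == t or t == b:
            value = o            # both sides agree, or only we changed it
        elif o == b:
            value = t            # only they changed it
        else:
            conflicts.append(key)
            value = o            # both changed it differently: keep ours, report it
        if value is not None:    # None means the sub-unit was deleted
            merged[key] = value
    return merged, conflicts
```

With a handler like this, one artist moving the box and another moving the lamp in the same level file merge cleanly, and a conflict is reported only when both touch the same object. The same `split` / `join` pair could be nested to get the hierarchy: directory, file, area, object.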

Sadly, it is becoming a tradition that each time I analyse some technology the conclusion is mostly "there's no such thing yet - write it yourself". Adding version control to my TODO list.
