SQL Database Version Control – depesz Versioning

In the previous article, we reviewed how a difference engine, as implemented by apgdiff, in conjuction with a VCS, can assist in controlling upgrades to a SQL database,

In “How to manage changes to your database?” Hubert “depesz” Lubaczewski explains why the schema diff approach hasn’t worked for him. His primary concern is that a development database tends to have extraneous objects, e.g., secondary indexes, other tables in a non-finalized state. Hubert goes on to introduce his own Versioning project designed to aid in managing database “patches.”

Today we’ll explore how Carol and Dave could use Versioning to deal with the changes described in Version Control, Part 2: SQL Databases.

Version 0.1

The Versioning tool is complementary to the overall process of developing a database application. It is more concerned with getting the deployment correct rather than helping the programmers or DBAs with figuring out what needs to be changed for the current release.

After installing Versioning, for the first release, Carol needs to create two scripts, film.sql:

BEGIN;
  SELECT _v.register_patch('film');
  CREATE TABLE film (
      id INTEGER NOT NULL PRIMARY KEY,
      title VARCHAR(32) NOT NULL,
      release_year INTEGER NOT NULL CHECK (release_year >= 1888)
  );
COMMIT;

and rollback-film.sql:

BEGIN;
  SELECT _v.unregister_patch('film');
  DROP TABLE film;
COMMIT;

The argument to the register_patch and unregister_patch functions (created in the Versioning schema _v) is the name given to the patch. The film.sql script is executed first. If there are any problems after installation and testing, rollback-film.sql can be used to remove the patch. The patches table records what patches have been installed:

moviesdb=> select * from _v.patches;
 patch_name |          applied_tsz          | applied_by | requires | conflicts
------------+-------------------------------+------------+----------+-----------
 film       | 2011-02-18 12:51:06.780811-05 | carol      | {}       | {}
(1 row)

Version 0.2

For the second version, Dave has to create two other scripts:

BEGIN;
  SELECT _v.register_patch('film-length', ARRAY['film'], NULL);
  ALTER TABLE film ADD COLUMN length SMALLINT
      CHECK (length > 0 AND length < 10000);
COMMIT;
BEGIN;
  SELECT _v.unregister_patch('film-length');
  ALTER TABLE film DROP COLUMN length;
COMMIT;

Carol also creates two scripts for her change:

BEGIN;
  SELECT _v.register_patch('film-rating', ARRAY['film'], NULL);
  ALTER TABLE film ADD COLUMN rating CHAR(5);
COMMIT;
BEGIN;
  SELECT _v.unregister_patch('film-rating');
  ALTER TABLE film DROP COLUMN rating;
COMMIT;

The second argument to the register_patch function is an array of requirements, i.e., patches that should have been installed prior to the current one. If required patches are missing, Versioning rejects the patch with an error:

moviesdb=>   SELECT _v.register_patch('film-length', ARRAY['film'], NULL);
ERROR:  Missing prerequisite(s): .

Versioning also provides a shell script, list-dependencies-from-patches.sh, that can be used to determine the order in which patches should be applied to avoid such dependency errors.

The third argument to the register_patch function is an array of conflicts, i.e., patches that would conflict with the current one.

Version 0.3

For the third release, it should be obvious that Carol and Dave have to create two new scripts each: an installation script and a rollback script. The installation scripts contain the same SQL statements as shown under Version 0.3 in the previous article but preceded by a SELECT _v.register_patch() and wrapped in transaction statements. The rollback scripts consist of SQL statements to undo the effect of the installation statements, again together with an unregister_patch call and wrapped by BEGIN/COMMIT, e.g., here is Dave’s rollback-film-language.sql:

BEGIN;
  SELECT _v.unregister_patch('film-language');
  ALTER TABLE film DROP COLUMN language_id;
  DROP TABLE language;
COMMIT;

Review

Versioning is simply a set of SQL scripts so it should run on any supported PostgreSQL environment, but the list-dependencies-from-patches.sh script requires bash. It needs PL/pgSQL so you’ll have to create the language in any pre-9.0 database if not present.  With Versioning, the VCS will store the patch installation and rollback scripts, but not necessarily a representation of the overall database, although that could be satisfied by storing the pg_dump -s outputs at each release.

To determine which SQL statements to include in the patch scripts, Carol and Dave will have to (a) use Versioning in their own development databases, (b) keep a record of the database changes as they make them, or (c) inspect the differences between the development and production databases, perhaps using a tool such as apgdiff. The first option seems to preclude the use of developer-friendly tools such as pgAdmin to modify table structures. Most options require fairly knowledgeable database developers since they have to understand not only how to make a change but also how to undo it, including being aware of dependencies between database objects.

On the other hand, use of Versioning during development will tend to ensure that changes to table structures are adequately tested–a concern voiced by Kenneth Downs. The Versioning tool can also help with complex upgrade scenarios such as renaming tables or columns without losing existing data, splitting a column into two, or fancier database refactorings.

About these ads

4 thoughts on “SQL Database Version Control – depesz Versioning”

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s