SQL Version Control System Design Issues

The initial release of Pyrseas is nearing completion. As noted elsewhere, Pyrseas was born to follow through on the concepts of the Andromeda Project. The idea of using a data dictionary specified in JSON or YAML was perhaps the most attractive feature of Andromeda. However, I felt there were several areas that could be improved upon. In this post I’ll go over some of the design issues that influenced the creation of the Pyrseas version control utilities.

Test Driven Development

One of the goals of Pyrseas is to encourage, support or otherwise assist in the development of test-driven applications. The tools should allow the DBA or developer to experiment with changes to database tables and other objects with almost as much ease as a file-based application programmer can change the file structures (granted, the latter has much more work overall due to the lack of a DBMS). ALTER TABLE is great to experiment and make changes, but you need something else if you also want (or need) to keep a current representation of the CREATE TABLE statement in your VCS.

In order to encourage TDD, I think it’s crucial for the tools to do their work speedily. Andromeda takes about 10 seconds to process a database change such as those I have been using as examples. The analogous Pyrseas tool completes the task in one to two orders of magnitude less time.

Team Development

Another objective is to make the tools convenient for use in a team environment. In other words, it should be easy for one DBA to make a change in her database and confidently share it –together with a set of related code changes– with her colleague via a distributed VCS. Note this does not preclude them being used by a single developer.

Modularity

Andromeda insists on a single data dictionary file. Although it requires “modules” to be defined to group tables, all user tables are created in the ‘public’ schema. It appears that Andromeda “modules” are primarily for user interface and security constraint purposes.

The initial release of Pyrseas recognizes the existence of PostgreSQL schemas and allows the DBA to place the tables in schemas as desired. Currently, the Pyrseas dbtoyaml tool (see below) can output specifications for a single schema or for selected tables. I hope to refine this capability in the future, so that –if desired– the data dictionary file could be split and stored in the VCS in a modular fashion, i.e., side-by-side or close to related application code.

Existing System Support

One aspect of Andromeda that I found limiting when I started delving into it was what happens if you want to use if for an existing database, application or project. First, it requires you to create a YAML specification of your database manually, according to its rules, for example, columns need to defined before tables and then “re-used” to specify the tables. Secondly, you can only use a limited set of data types, e.g., a ‘Y’ or ‘N’ column for a BOOLEAN.

For Pyrseas, the first tool designed was dbtoyaml which connects to a database and outputs its schemas, tables and columns in a YAML/JSON format. This format can then be input directly into the second tool, yamltodb. The latter generates SQL statements to bring a second database to match the first.

Summarizing all of the above, this allows a DBA to make a database change either through psql, pgAdmin or some other tool, and then use dbtoyaml to quickly generate a YAML dictionary which can be submitted to the VCS together with related code changes. Another DBA or programmer can pull the changes and apply them promptly to their database to test or integrate them with their own database and code changes.

Platform Support

This an area that concerns implementation but nevertheless affects design. SQL database version control tools can opt to support multiple databases or just one. Presently, I’ve decided to concentrate on PostgreSQL. I believe this can provide complete or nearly complete coverage of the many PG features, and will keep my hands full. If Pyrseas is found useful by others, I’ll be glad to assist anyone with expanding support to other DBMSs.

Development of Pyrseas was done under Linux, but since it’s written in Python, hopefully it will encounter few problems on other OSs.

Comments and discussion welcome!

About these ads

8 thoughts on “SQL Version Control System Design Issues”

  1. Do you plan to support migration of live production schemas from one version of the dictionary to a newer one while preserving/migrating the live data?

    How will you handle “modules” that may have interdependencies (say foreign keys) among them, but which you may want to version and upgrade independently?

    We tried to tackle these problems with an in house system based on direct/reverse scripts and a maven dependency based class-path management, but things get hairy quickly and you have to sacrifice either the ability to relate inter-module data or the ability to independently version and upgrade those DDL islands…

    1. Hello Juan Jesús,

      As I described above, my initial focus is supporting development. I believe the Pyrseas tools could be used –either in conjunction with something like Hubert’s (depesz) Versioning, GlueFinance POV or perhaps extensions to the tools– to preserve the live data.

      Pyrseas doesn’t know about “modules” (as Andromeda does). It does understand PostgreSQL schemas. It can handle, for example, the autodoc regression database which has multiple schemas and referential integrity constraints across schemas. As long as it makes logical sense as a unit, Pyrseas ought to (I hope :-)) be able to handle schema interdependencies. Of course, the proof will be in the testing with actual user scenarios.

  2. Hi David, Yes, I’ve been keeping an eye on PG extensions. The queries on the catalogs made by both utilities could certainly be packaged as part of an extension. The design issue becomes where to draw the line(s): return table data, return JSON/YAML, etc.

    1. Thanks Daniel. Note, however, that Randolph is a commercial product for SQL Server, not a generic SQL version control solution.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s