Version Control, Part I: Pre-SQL

A version control system (VCS) such as Git or Subversion is an indispensable component of modern software projects. Many applications use a relational database management system (DBMS) such as PostgreSQL. While a DBMS facilitates an application’s data management tasks, it presents some version control challenges.

To understand the version control issues we’ll examine a project with two programmers, Alice and Bob, developing an application loosely based on the Pagila sample database. We’ll start our tour with an application predating the use of SQL.

When developing a file-based application in C, C++ or other programming languages, you typically define the data structures in your source code, e.g., in header files that are included or otherwise referenced by the application code.

Version 0.1

For example, a C application to maintain information about movies may start off by defining the following in a file named film.h:

typedef struct
    int id;
    char title[32];
    int release_year;

The typedef may represent the actual layout of the file records as well as the application buffers used to hold the data input by users or to be displayed on screen.  In addition, the application will need code to ensure the uniqueness of the film identifiers, e.g., to maintain a file index, and to validate the release year upon input, e.g., release_year >= 1888.

The film.h file and the rest of the application are submitted to a VCS and are released as version 0.1.

Version 0.2

For the next version, Bob adds short length; to the end of the FILM structure.  He also adds help text to tell users the new field is to capture the length of the movie in minutes and code to validate that length > 0 and length < 10000.

Separately, Alice adds char rating[6]; to the end of the struct, application help text explaining this is for the MPAA rating, and code to validate the input against a list of known values.

Finally, Bob and Alice submit their changes to the VCS and either he or she (or an integration manager) merge the changes and settle on the following structure:

typedef struct
    int id;
    char title[32];
    int release_year;
    short length;
    char rating[6];

The VCS allows Bob and Alice to keep track of changes to the FILM structure, e.g., to diff the tagged revisions of film.h:

$ git diff 0.1 0.2
diff --git a/film.h b/film.h
index 8670425..1df5aaa 100644
--- a/film.h
+++ b/film.h
@@ -5,4 +5,6 @@ typedef struct
     int id;
     char title[32];
     int release_year;
+    short length;
+    char rating[6];
 } FILM;

To allow users to migrate existing applications, a conversion program is also needed to reformat the data file.

Version 0.3

For the third revision, Alice adds a genre array to FILM, and a CATEGORY structure to allow users to maintain the allowed values for genres.  In the meantime, Bob adds a language field to the FILM structure, together with an application-maintained array of language codes and names.  Once merged and released, the structures look as follows (with comments):

typedef struct
    int id;            /* genre/category ID */
    char name[26];     /* genre name */

typedef struct
    int id;            /* film ID */
    char title[32];    /* film title */
    int release_year;  /* release year */
    short length;      /* length in minutes */
    char rating[6];    /* MPAA rating */
    int categories[5]; /* film genres */
    int language;      /* spoken language code */

The conversion program has to be updated to migrate to the expanded structure, but the VCS still supports comparison of different revisions.  You can see it also allows each programmer to work independently on unrelated features and a distributed VCS such as Git can keep track of the merge actions, as seen below—version 0.2 was merged by the project manager whereas for version 0.3 Bob pulled Alice’s change before committing:

$ git log --pretty=format:"%h %s" --graph
* eb0a385 Add language.
* e6568c5 Add film genres.
*   f404fe2 Merge Alice's add rating.
| * bdc8247 Add MPAA rating.
* | 5772f38 Add length.
* 76ecd04 First version

With this preliminary flat-file scenario established, we’re ready to move to our next installment where we will contrast the above to the situation using a SQL DBMS.

One thought on “Version Control, Part I: Pre-SQL

  1. Pingback: Version Control, Part 2: SQL Databases | Taming Serpents and Pachyderms

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.