The Phantom of the Database – Part 1

Scenario: A row with “Seven  Samurai” (notice two spaces between the words) as the movie title and 1956 as the release year, in the film table.

Plot: User Alice updates the row to remove the extra space. In the meantime, user Bob updates it to change the title to its phonetic Japanese equivalent “Shichinin no Samurai” and user Carol updates the release date to 1954, the year the movie was first shown in Japan. Whose updates will survive?

If the updates take place sequentially and each user fetches the previously updated row prior to saving his or her change, we should end up with a row with “Shichinin no Samurai” as the title and 1954 as the year.

However, if each user starts viewing the original row in a web browser and the application is incorrectly designed, one or two changes may be overwritten.

For example, my minimalist user interface application even in its more robust incarnation, issues an UPDATE with both the title and the year as received from the user. This was done intentionally, for simplicity, to be corrected later. As a result, if the users press the “Save” button in the sequence shown, both Alice’s correction to the English title and Bob’s change to a Japanese title will be lost because they will be overwritten by Carol’s change.

If you’ve used a centralized VCS, like Subversion, you’re probably familiar with the issue. If our users were developers updating a source file with the movie information, they would (typically) be prevented from overwriting because the VCS would inform the second and third users that they had to merge the changes in prior commits.

So, is it enough if the application only updates the columns that were changed by the users, somewhat like a VCS automatically merging non-conflicting changed lines?

To be continued …

4 thoughts on “The Phantom of the Database – Part 1

  1. Whether or not an application should update the entire row or just changed fields is an interesting question. The answer is that it depends on the atomicity of the data. It is entirely possible to view the ‘entire row’ as the appropriate level. In fact, using this example, I would argue that changing the title and year independently is NOT a good idea. Arguably the US release should be viewed as a single event and so should have the English title and the US release date, whereas the the Japanese release should have the Japanese (phonetic) title and the Japanese release date. It is desirable to keep those things synced, and updating the entire row allows for this.

    On the other hand, communication is a very important tool, so it is probably desirable to see if the row had been updated since being retrieved to the application and if so, show the changes to the user(s) to let them make an informed decision. Adding a communication channel or contact information might also be useful.

    Aside: it seems obvious from this example that the schema does not appropriately reflect the domain, and that you really need a new table ‘Releases’ which contain Title and Year and possibly other values like Region and Medium (eg: dvd/theatric/vhs) and some way of describing the differences if any (eg director’s cut/unedited/subtitled).

    • Thanks for your comments and interest. The schema is an extremely simple example that I used earlier in my series on version control for SQL databases. It’s supposed to reflect a movie as present in IMDb. I’m developing a tutorial application and using it to raise certain app dev and database issues. In a real application, for example, you’d have to decide whether to keep the original title (in Japanese), the phonetic version, the English translation or any number of local titles used in various countries (as IMDb does). The tutorial is more like a personal movie database, or like Pagila, a retail store database, where all these details would not be of interest.

  2. Pingback: The Phantom of the Database – Part 2 | Taming Serpents and Pachyderms

  3. Pingback: The Phantom of the Database – Part 3 | Taming Serpents and Pachyderms

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.