PostgreSQL Conference Brasil 2011

This Thursday and Friday I’ll be attending PGBR 2011 in Brazil (No! not that Brazil, although I do recommend the movie). Looking at the schedule, here are my tentative picks:

Day 1

Deciding on what to attend on the first slot is tough:  Bruce Momjiam talking about MVCC, Dave Page on the new SQL/MED features and Alvaro Herrera on locks and more locks. I’m inclined to go to Dave’s PostgreSQL at the center of your dataverse.

Next, we have a choice between VMWare vFabric, a case study featuring pgBouncer, and a presentation about window functions. My preference: Window Function – Por um SQL simples e melhor( (Window Function – Towards a simple and better SQL).

After lunch it’s between a review of the TODO list, followed by administrative functions, horizontal scaling with pgPool, or a round table.  I’ll probably go to the Round Table: “The PostgreSQL market in Brazil.”

After a break, the choices are: a case study on municipal usage, followed by GeoSpatial, statistics and monitoring, or benchmarking. Greg Smith’s Bottom-up Database Benchmarking gets my election.

Day 2

To start off the second day, there’s another international Postgres trio: Koichi Suzuki on Postgres-XC, Greg again on performance, and Jaime Casanova on 2ndQuadrant’s repmgr tool. The PostgreSQL Performance Pitfalls talk is the one that holds my interest.

Continuing, we have a choice between scalability and buzzwords, followed by a talk about a PG-based retail trade application, Dickson Guedes on PG 9.1 extensions, or OLTP good practices followed by a bidirectional replication case study. I think I’ll go to Dickson’s Estripando o Elefante – dividindo seus problemas em problemas menores (Gutting the Elephant – Dividing your problems into smaller ones).

After lunch the talks are: everything you wanted to know … but were afraid to ask (no, it’s about Postgres), a PL/pgSQL tutorial, and Bucardo replication followed by pgMind (a decision support tool). My selection: Dojo de programação PL/pgSQL (PL/pgSQL Programming Dojo).

The last slot offers two case studies, one on mobile messaging applications and another on municipal uses, and my choice, Meu ambiente cresceu e eu não planejei. E agora? (My environment grew and I didn’t plan. Now what?). Lightning talks follow.

As you can see, a very busy two days and a variety of topics. I look forward to meeting Brazilian and other Postgres enthusiasts. I’ll try to report on the event a week or so later.

A Multi-Layered Test Cake

Dobos layered cake*

A recurrent theme in software engineering are the multiple layers (or tiers) into which systems are subdivided. In the ideal case, a given layer only interacts with the immediate layers “above” and “below” it. In practice, that clean separation of responsibilities is not always possible or, some may argue, desirable.

Nevertheless, thinking in terms of layers is quite useful. For example, the database application I’ve been presenting in this series consists of a PostgreSQL database, psycopg2 (a Python DB-API adapter for PostgreSQL), a business logic module that calls on psycopg2, Werkzeug (a WSGI library), the application modules that call on the business logic and Werkzeug, the HTTP protocol, and the user’s web browser.

As application developers, we need not concern ourselves with the details of how PostgreSQL stores the data that we send to it, nor with testing its internal functions. We rely on PG hackers to cover that testing. Similarly, we rely on Psycopg2 programmers to test communications with PostgreSQL and the Werkzeug team to test its WSGI and HTTP-related functionality. We do have responsibility for testing the business logic, the application code, and the interface presented to the user via the browser.

Tag v0.4.1 of the Database UI Tutorial at GitHub adds a set of unit tests for exercising the business logic module (i.e., werkzeug/bl/film.py). The test module (werkzeug/tests/bl/test_film.py) is pretty straightforward: there is a test method for each method in the business logic class. In addition, there are methods for verifying failure conditions, such as duplicate inserts or deleting a missing record.

Although the unit tests were written after the business logic module (i.e., not in keeping with test-driven development), they helped to resolve an open design issue. Earlier, some methods in dblib.py (thin interface to psycopg2) took an “expected count” argument to check the number of rows affected by a database operation. That didn’t look right, so I had removed it.  While developing the unit tests, it became clear that the checks ought to live in the business logic layer.

Tag v0.4.2 adds unit tests for testing the WSGI or web interface. I was very pleased with how easy it was to use Werkzeug’s testing faciities, Client and BaseResponse, to create the tests: simply “drop” the database app instance into the Client and then call its get or post methods to get the response headers and data. I only had to add the standard library’s xml.dom.minidom to examine particular elements in the HTML body.

Both the business logic and web test modules depend on a utility module which is “reheated” from the Pyrseas version control testing. A nice capability is that the base DbAppTestCase, in its setUpClass method (a Python 2.7 feature), ensures the test database is in sync with the YAML spec.

Coming up: testing the user (browser) interface, via Selenium.


* Dobos cake photograph, courtesy of Bruce Tuten.

Dueling Frameworks, revisited

Werkzeug

In the closing remarks of “Dueling Frameworks”, I hinted that I may continue to explore Flask. That road first led me to take a closer look at the recent state of Werkzeug.

As I had done with Flask and CherryPy, the natural first step was to port the minimalist WSGI database user interface (but taking advantage of the robustness and templating (Jinja2) enhancements) to Werkzeug. This turned out to be very straightforward (see GitHub tag v0.3.2). Key changes:

  • replace my Request/Response objects by Werkzeug’s own
  • add a request argument to the various handler methods
  • simplify the dispatch methods, using request.path and request.form
  • replace the DatabaseApp.__call__ by a much simpler wsgi_app
  • eliminate serve_css (static files are served by Werkzeug’s SharedDataMiddleware)
  • use Werkzeug’s development server.

The second modification (see tag v0.3.3) was to use Werkzeug’s Map and Rule classes to refactor the URL routing. What I liked about this, compared to Flask’s, is that I could keep the delegation of routing within the FilmHandler class.

The final version —for now (see tag v0.3.4), improves the URLs to make them more RESTful. While they’re not completely in agreement with Oliver Charles comments, they’re a step in the right direction. This is something I could not achieve with CherryPy (without bringing in or writing an external module).

The bottom line is that I’ve decided to use Werkzeug for continuing with this tutorial. Perhaps what won me over was its debugger. Also, Werkzeug is a library, not a framework or even a micro-framework, so it feels like a much lighter dependency than CherryPy or Flask. If there is a concern, it is that Werkzeug lacks support for Python 3. This may make a difference —in terms of “traction”— now that WebOb has been ported, but I’m hoping Werkzeug/Flask fans will correct that in the not too distant future.

Quo vadis, Pyrseas?

When I found Andromeda, over a year ago, I was searching for a tool that would help me maintain PostgreSQL tables supporting web sites, somewhat like a Django admin app, but without Django itself since that was not part of the stack. Ideally, this tool would also facilitate table redesign. I had already written a maintenance app, using CherryPy, Mako and SQLAlchemy, but it was cumbersome to add or make changes to it, and database version control was limited to storing the latest CREATE TABLE statements in a Subversion repository.

Andromeda looked attractive because of its version control features and the capability of automatically generating admin apps. However, after experimenting, I found it lacked the ability to instropect an existing database, requiring you to input the schema in its own YAML format. Pyrseas was born as a result of thinking on how to improve this.

Fast forward to the present: Pyrseas 0.4.0 has been released, supporting about 70% of the PostgreSQL DDL features. This seems like a good time to reflect on where to go next, both short term and long term.

Database Version Control

On the version control front, the following tasks come to mind:

  • Add support for missing DDL features: TEXT SEARCH objects, USERs/ROLEs, TABLESPACEs, FDWs and EXTENSIONs.
  • Ensure existing DDL features are covered fully, i.e., all syntax options
  • Add further options to dbtoyaml/yamltodb for selective output or SQL generation
  • Package Pyrseas as a PG 9.1 EXTENSION, allowing access from other languages and tools
  • Support migration of data, i.e., when ALTER TABLE is not enough
  • Your favorite version control feature

Application Generation

Progress on this front has been slow. Pyrseas could generate minimalist web admin apps with CherryPy, Jinja2 and perhaps something like WTForms, but how many users would want to use them? Conversely, how many would also want support for HTML5, JQuery, or some other WSGI server or templating engine?

I’m not decided, but I still need that admin app, so perhaps I’ll build the capability and blog about it, for tutorial purposes, as I have done so far.

What do you think? What is your favorite database version control feature? What would you like to see in a database web app generation tool? Let me know …

P.S. I’ll be attending PGBR 2011, so hopefully I can hear from some of you personally, in English, em Português o en español.