Tag Archives: postgresql

Testing Python and PostgreSQL on Windows, Part 5

I’ve got the Perl on Windows blues …

Aside from PL/pgSQL, the base distribution of PostgreSQL supports three procedural languages: Perl, Python and Tcl. When creating Pyrseas unit tests for languages (before they became EXTENSIONs), PL/Perl seemed like the “natural” choice. Perl is available on virtually all Linux and BSD base distributions, and (in contrast to Python) PL/Perl is available in both trusted and untrusted versions—a distinguishing attribute that the unit tests ought to check.

Alas, when testing on Windows I discovered that Perl isn’t as ubiquitous or as easy to deal with as it is on Linux.

The preliminary research prompted me to install Active Perl, specifically its Community Edition. Currently, the only two versions available are 5.14 and 5.16. Strangely enough, if you install one of those versions and then attempt to create the PL/Perl language under PG 8.4 or 9.0, the command succeeds, but when you try to create a function using plperl, you’ll see something like:

ERROR:  could not load library "C:/Program Files/PostgreSQL/8.4/lib/plperl.dll":
 The specified module could not be found.

Against PG 9.1 and 9.2, if you installed Perl 5.16, you’ll see the error message when issuing the CREATE EXTENSION (or LANGUAGE) statement. If you installed Perl 5.14, there should be no error.

When I first saw the error message, I was a bit puzzled since the plperl.dll libraries had all been installed and were located in the paths show in the messages. What “specified module” was missing?

Some web searching pointed me to Dependency Walker (depends.exe), a tool that appears to be indispensable if you’re going to be testing with multiple executable and DLL versions. It is analogous to Linux ldd. Depends.exe showed that plperl.dll in the 8.4 and 9.0 installations was linked with PERL510.DLL and in 9.1 and 9.2 with PER514.DLL.

Unfortunately, Active Perl has no Perl 5.10 Community Edition available, so off I was looking for an alternative. Thus I found Strawberry Perl.

The downside of Strawberry Perl’s installers is that they install it in C:\strawberry so you can’t have both Perl 5.10 and 5.14 at the same time. Someone on IRC explained that it is possible to install it in two separate paths (but it ain’t easy). For now, I chose to only install Perl 5.10. This allowed me to test Pyrseas using Python 2.7 and 3.2 against PostgreSQL 8.4, 9.0, 9.1 and 9.2, with only one Perl-related test faling (under PG 9.1 and 9.2, due to the absence of Perl 5.14).

A note of caution: Strawberry Perl installs GCC (3.4.5 in the Perl 5.10 version). If you have a pydistutils.cfg specifying a mingw32 compiler (as mentioned in my previous post), that may cause problems if you try to install or upgrade psycopg2 (or some other C extension module).

Testing Python and PostgreSQL on Windows, Part 4

At the end of Part 2, I suggested those who were anxious to start testing could try python tests\dbobject\test_schema.py right after installing psycopg2, and implied everything would work just fine by showing the output of a successful run.

That was deceptive because before running the Pyrseas tests you need to create a PostgreSQL user. So you first need to connect using psql or pgAdmin, as the postgres user and issue a command such as the following:

CREATE USER username CREATEDB SUPERUSER password;

Of course, username and password should be your user name and a suitable password, respectively. While you’re at it, you can also create the two testing roles:

CREATE ROLE user1;
CREATE ROLE user2;

The user username needs the CREATEDB privilege to create the Pyrseas test database, by default, pyrseas_testdb. Alternatively, the test database could be created first, e.g., by postgres and owned by username. Also, if username isn’t granted the SUPERUSER privilege then the tests requiring it will be skipped.

One last detail before being able to run the tests: create a PostgreSQL password file, i.e., %APPDATA%\postgresql\pgpass.conf. The format is straightforward:

*:*:*:username:password

Make sure you have the latest pyrseas/testutils.py. It includes a change to make the tests portable to Windows. You’re now ready to give all the tests a whirl:

python tests\dbobject\__init_py

or you may prefer to use test discovery:

python  -m unittest discover -s tests/dbobject

One of the first issues you’ll notice is when running test_tablespace.py, which exercises support for PostgreSQL tablespaces. According to the documentation, “tablespaces can be used only on systems that support symbolic links” (aside: thanks to Andres Freund and Merlin Moncure for answering questions about this on IRC).

It seems Windows Vista (or even XP) may have something akin to symbolic links (junctions?). Nevertheless for Pyrseas, it is sufficient to create directories for the tablespaces, e.g.,

C:\>md c:\somedir\pg\9.1\ts1
C:\>md c:\somedir\pg\9.1\ts2

However, when you try to define the tablespace from psql, you’ll probably see:

postgres=# CREATE TABLESPACE ts1 LOCATION E'C:\\somedir\\pg\\9.1\\ts1';
ERROR:  could not set permissions on directory "C:/somedir/pg/9.1/ts1": Permission denied

The problem is that the directory needs to be owned by the postgres user. Unfortunately, there is nothing equivalent to chown on native Windows XP Home. Andres had mentioned cacls and after some web searching and trying, I came up with the following commands as the nearest approximation to chown:

cacls c:\somedir\pg\9.1\ts1 /E /G postgres:F
cacls c:\somedir\pg\9.1\ts2 /E /G postgres:F

These give the postgres user FULL owner privileges over the directories. If you invoke the CREATE TABLESPACE statements again, then they should succeed … Well, at least on PostgreSQL 8.4, 9.0 and 9.1, the cacls commands allow the tablespaces to be defined. However, under 9.2 the “Permission denied” error persists (still haven’t figured out why—if you know why, do leave a comment).

Update: Thank to alemarko’s comment below, I figured out that for PG 9.2, assuming you installed with the defaults, the correct incantations  to use are:

cacls c:\somedir\pg\9.2\ts1 /E /G networkservice:F
cacls c:\somedir\pg\9.2\ts2 /E /G networkservice:F

Testing Python and PostgreSQL on Windows, Part 3

As a commenter mentioned in response to Part 2, an alternative to using pip install psycopg2, which requires that you first install VC++ 2008 Express, is to download and install the Windows port, aka win-psycopg. Jason Erickson makes these builds available for several versions of Python for both 32- and 64-bit Windows.

If you’re not planning to use virtualenvs or tox (which creates virtualenvs for you), then the win-psycopg installer is the easiest way to satisfy the psycopg2 dependency (in fact, that’s how I started after the initial failure with pip). Simply download the Python 2.7 and 3.2 installers, run them and you’re done. However, the installers don’t work in a virtualenv (there is a workaround to extract the files, but I didn’t explore it because I wanted to use tox).

Another option is to build psycopg2 with the MinGW compiler. Daniele Varrazzo has a post describing this. Daniele used a special MinGW package, but I chose to install the latest (mingw-get-inst-20120426.exe) from MinGW.org (click on the Navigation – Downloads link which will take you to SourceForge).

To get the psycopg2 sources, you can download the .tar.gz package, or, since you have Git, do this from Git Bash:

git clone git://luna.dndg.it/public/psycopg2.git
cd psycopg2
git checkout 2_4_5  # or latest tag

Create a pydistutils.cfg file in your home directory (%USERPROFILE%) with the following (an alternative is to use the --compiler option to the python setup.py command below):

[build]
compiler=mingw32

Make sure MinGW, Python, and PostgreSQL are in your PATH, e.g., set PATH=C:\MinGW\bin;C:\Python27;C:\Program Files\PostgreSQL\8.4\bin;%PATH%, and then run:

python setup.py build_ext build

With the latest MinGW and unless Python bug 12641 has been resolved, you’ll see the following error message:

cc1.exe: error: unrecognized command line option '-mno-cygwin'

The workaround is to edit the cygwincompiler.py file in your Python Lib\distutils directory and remove all instances of -mno-cygwin. Hopefully after that, the python setup.py above will work and then you can run the following to install it:

python setup.py install

Rinse and repeat for the Python 3.2 version and you should be ready to test connecting from both Python versions to PostgreSQL.

Aside: To remove the pip-installed psycopg2, you run pip uninstall psycopg2 from the corresponding Python environment. To remove win-psycopg, you use Control Panel’s Add or Remove Programs and click Remove on the desired version. To remove the versions installed with MinGW, I’m afraid you’ll have to resort to deleting the Lib\site-packages\psycopg2 directories and the related psycopg2-*.egg-info files.

Testing Python and PostgreSQL on Windows, Part 2

In the previous post, I covered installation of Git, PostgreSQL and Python under Windows in order to set up a Pyrseas testing and development environment. Today, we’ll explore installation of the Python dependencies.

The Hitchhiker’s Guide to Python recommends first downloading and running the distribute_setup.py script. This gives you the easy_install command but the Guide recommends installing pip (with easy_install pip) and then using pip to install all other modules.

You can use pip to install pyyaml with the following command:

pip install pyyaml

However, if you try pip install psycopg2 (or even easy_install psycopg2), it’s very likely you’ll see the error:

error: Unable to find vcvarsall.bat

As best as I’ve been able to determine the only way to get around this is by installing Microsoft Visual Express Studio. According to this email and this post, for Python 2.7, it must be the 2008 Express Studio which, to make things interesting, is no longer available from the download links given. If you search enough you may find it here (download vcsetup.exe) (Update below). After installing VC++ 2008 Express (and if you haven’t installed Strawberry Perl—a later installment in our saga), the pip install psycopg2 command should succeed.

However, if you try to import psycopg2 at the Python 2.7 prompt you may be surprised with a traceback ending in:

    from psycopg2._psycopg import BINARY, NUMBER, STRING, DATETIME, ROWID
ImportError: DLL load failed: The specified module could not be found.

Ahh … the mysteries of Windows DLLs. Don’t despair: this probably means you don’t have the PostgreSQL DLLs (libpq.dll in particular) in your PATH. Add one of the postgres\x.x\bin directories to your PATH and (hopefully) you should then be able to connect from Python 2.7 to your PostgreSQL installations.

OK, let’s turn our attention to Python 3.2. If you followed the Hitchhiker’s Guide instructions previously and added C:\Python27 to your PATH, you’ll now have to change that to C:\Python32. Suggestion: create a couple of batch scripts, e.g., env27.bat and env32.bat, so you can easily switch between the two Python installations. And don’t forget to add the postgres\x.x\bin directory as well.

For 3.2, once again run the distribute_setup.py script, easy_install pip, and pip install pyyaml, as for 2.7 above. Then you can run pip install psycopg2, and if you installed VC++ previously, the gods may smile upon you and you may see the following message:

Successfully installed psycopg2
Cleaning up...

At this point, if you followed along, you’ll have four versions of PostgreSQL (8.4 through 9.2), two versions of Python (2.7 and 3.2), each with PyYAML and psycopg2, ready for testing. If you’re anxious to check things out, invoke one of the PATH setup scripts and try the following, from the Pyrseas source directory:

set PYTHONPATH=%USERPROFILE%\src\Pyrseas
C:\...\src\Pyrseas>python tests\dbobject\test_schema.py
............
----------------------------------------------------------------------
Ran 12 tests in 1.452s

OK

There are some alternatives to installing psycopg2 using pip and VC++ 2008.  I’ll cover those in a subsequent post.

Update: Microsoft seems to keep changing download URLs. Your best bet is to search for “Visual C++ 2008 Express download.” Currently, that should lead you to the following download link.

Testing Python and PostgreSQL on Windows – Basics

In my previous post, I wrote:

Although I have not yet personally run the [Pyrseas] unit tests on Windows …, I believe the tox setup should be quite portable …, since the tests only depend on Python and psycopg2 being able to connect to Postgres, i.e., they do not depend on running any PG utilities from the command line.

Several moons ago, I had done a cursory test of the Pyrseas utilities on Windows from a source zip file, but now I wanted to set up a full development environment (well, almost full—I used Notepad for minor editing) and run through all the unit tests on as many Python/PostgreSQL combinations as possible and ideally using tox.

This post describes what I found out during the install/test process. Hopefully others will find it useful.

Operating System

I chose to use Windows XP Home Edition running under VirtualBox. It’s not a professional solution, but I wasn’t prepared to pay for the “privilege” of using Windows and it’s likely others also have a home edition CD or similar media from an earlier hardware purchase.

Version Control

Pyrseas sources are stored on GitHub. Installing Git and cloning the repository was probably the most uneventful step. The Git download page gives you an installer which offers three options. I chose “Use Git Bash only” as this appears to be the most friendly to someone coming from a Linux/Unix environment. It doesn’t change nor does it require you to change the PATH, all you need to do is select “Git Bash” from the Start menu and a Bash shell is opened for you.

DBMS

Installing PostgreSQL was fairly straightforward. The Windows download page leads to EnterpriseDB one-click installers for multiple platforms and for the more recent versions you have to choose between 32-bit and 64-bit systems. The installer asks for an installation directory, data directory, postgres user password, port number and locale, offering defaults except for the password.

The installer installs both the DBMS and pgAdmin III. If you’re more comfortable with psql, you can select “SQL Shell (psql)” from the Start menu. With either the latter or pgAdmin, you won’t have to change PATH, unless you want to run psql or some other PostgreSQL utility from a Command Prompt window.

Python

Installing Python can be done from Windows MSI installers available from Python.org for the latest releases of Python 2.7 and 3.2. Aside from specifying the installation directory, you’re given the choice of additional components to install, e.g., Tcl/Tk, documentation.

The installers provide a “Python (command line)” option from the Start Menu, but for testing or development, you’ll probably want to open your own Command Prompt window, in order to customize your setup. This requires that you add, e.g., C:\Python27 and C:\Python27\Scripts, to your PATH. Alternatively, you could use the Git Bash window to stay within a Unix-like environment (in which case you’ll still have to add the equivalent directories, e.g., /c/Python27, to PATH).

So far so good. A forthcoming post will cover more, shall we say, entertaining topics.

Testing Python and PostgreSQL on Multiple Platforms

I’m working on making the Pyrseas functional tests portable enough so that they can be submitted to the repository.

Until now, these tests —which exercise dbtoyaml and yamltodb directly— existed as Linux shell scripts. Briefly, each test runs both dbtoyaml and pg_dump -s on a source database creating YAML and SQL dump outputs, respectively. Then it runs yamltodb on a second dabase to recreate the source tables, etc., and finishes by comparing the first pg_dump ouput to that from the target database to verify that all database objects are present and identical.

The Pyrseas unit tests now use tox which makes it fairly easy to add new platforms. For example, on the Python side, the tox.ini configuration includes 2.7 and 3.2, using a single virtualenv for each version.  It would be easy to add 2.6 or 3.3 (when that is released or from a 3.3.0 rc1 install). To test against Postgres 8.4, 9.0, 9.1 and recently 9.2rc1, the only requirement is to define environment variables PG(84|90|91|92)_PORT with the port numbers used for those Postgres installations. Then tox takes care of running the tests eight times, using each Python/Postgres combination.

Although I have not yet personally run the unit tests on Windows or Linux/Unix variants other than Debian, I believe the tox setup should be quite portable (assuming multiple Postgres installations can be present on a given platform), since the tests only depend on Python and psycopg2 being able to connect to Postgres, i.e., they do not depend on running any PG utilities from the command line.

For the functional tests, running the Pyrseas utilities can be done in a fairly portable way thanks to the os.path, tempfile and subprocess modules. Even the diffing of the pg_dump output can be implemented without having to worry about the presence of a diff command, e.g., on Windows.

However, executing pg_dump against multiple Postgres clusters is not so easy. On Debian (and presumably Ubuntu and all other Debian derivatives), if installed from Debian packages, Postgres utilities can be invoked, for example, as

$ pg_dump --cluster 9.1/main -s pyrseas_testdb

The --cluster option causes the correct executable, e.g., /usr/lib/postgresql/9.1/bin/pg_dump to be run and using the correct port. This translates to Python as:

subprocess.call(['pg_dump', '--cluster 9.1/main', '-s',
                targdump, TEST_DBNAME])

The second element in the list could be provided programmatically to run the tests against various Postgres versions, but it would only work on Debian, Ubuntu, etc. (it also assumes the default cluster installation location).

For Red Hat variants, *BSD or Windows, the only solution I could come up with is requiring the existence of a shell script or .bat file with a set name, e.g., pg_dumpXX, where XX is the PG version number, somewhere along the PATH, to point to the right executable. That is not ideal so I’d appreciate hearing from others who may have dealt with similar issues.

PostgreSQL Indexes on Expressions

Pyrseas had its first release a little over a year ago and we now have our first backward compatibility issue. The first release included basic support for traditional indexes, i.e., one or more key columns. For example, given a table test1 with columns col1, col2 and col3, and an index on the last two, dbtoyaml would show (some details omitted):

table test1:
  columns:
  - col1
  - col2
  - col3
  indexes:
    test1_idx:
      columns:
      - col2
      - col3

One of the first issues reported the lack of support for “functional” indexes. I added that but unfortunately, didn’t realize that one can have more than one function or expression and even mix regular columns with expressions. Thus the support was limited to a single expression. Given the first example in the “Indexes on Expressions” documentation, dbtoyaml would show:

table test1:
  columns:
  ...
  indexes:
    test1_lower_col1_idx:
      expression:
        lower(col1)

The original issue was recently re-opened (thanks, Roger) to point out the deficiencies. A fix has been pushed. Thus in the next release, dbtoyaml will support indexes with multiple expressions and even combinations of functions and regular columns. Here is a weird example using the first table. Given CREATE INDEX test1_idx ON test1 (btrim(col3, 'x') NULLS FIRST, col1, lower(col2) DESC), dbtoyaml now outputs:

table test1:
  indexes:
    test1_idx:
      access_method: btree
      keys:
      - btrim(c3, 'x'::text):
          nulls: first
          type: expression
      - col1
      - lower(col2):
          order: desc
          type: expression

So instead of ‘columns’ (or ‘expression’), dbtoyaml outputs ‘keys’. Any key that is an expression is marked with the ‘type’ qualifier.To allow for backward compatibility, yamltodb will continue to accept ‘columns’, so existing YAML specs  with traditional indexes won’t need to be changed. However, if you have an index using an expression, you’ll have to edit as seen above.

Do you have an unusual index?  Try dbtoyaml (from GitHub) on it and let us know if it works (or not).

PostgreSQL Extensions and Pyrseas

Prompted by Peter Eisentraut’s blog post, I’ve finished adding support for PG 9.1 EXTENSIONs to the Pyrseas dbtoyaml and yamltodb utilities. For now, this is only available on GitHub.

In order to deal with procedural languages, which are now created as extensions, the utilities now fetch the pg_catalog schema (previously deemed uninteresting for the purpose of version control).  The output of dbtoyaml from a freshly created 9.1 database (assuming no customizations via template1) is now:

schema pg_catalog:
  description: system catalog schema
  extension plpgsql:
    description: PL/pgSQL procedural language
    version: '1.0'
schema public:
  description: standard public schema

This could be changed easily to exclude pg_catalog (which will now also appear against 8.4 and 9.0 databases) before the next Pyrseas release. Update: The pg_catalog schema will now only be shown if it has something other than a description.

I’m hoping some brave, adventurous or simply interested souls will help test the additions.  Please report any issues on GitHub.

Pyrseas PostgreSQL features: feedback requested

I’ve been considering the missing features of dbtoyaml/yamltodb.  Two of those are PG 9.1 features:  COLLATIONs and EXTENSIONs.  I plan to cover them eventually, but I think I ought to deal first with the remaining pre-9.1 features.

ROLEs (as well as USERs and GROUPs) and TABLESPACEs are not output by pg_dump (the equivalent of dbtoyaml), only by pg_dumpall.  I’m thinking that if I were to add support for ROLEs and TABLESPACEs I’d probably do it with a --cluster option to dbtoyaml, and the output would be something like the following:

database postgres:
  role one:
    createdb: true
    login: true
  role grp:
    roles:
      - one
 tablespace dataspace:
    location: /data/db

This approach could, in theory, produce output for all databases in a cluster, i.e., the databases would be the top nodes in the YAML spec, rather than the schemas as is normally the case. In other words, it would be the equivalent of pg_dumpall --schema-only. However, I suspect that few persons would be interested in that, at least for version control purposes—since different databases may belong to different projects.

On the other hand, I believe DBAs may want dbtoyaml to include “owner” and privlege (GRANT) information. David Fetter specifically asked for GRANTs saying they would be “handy for deployments.”

Owner and privilege information could be shown as follows:

schema public:
  table film:
    owner: jma
    privileges:
      admin:
        - insert
        - update
      jma:
        - all
      viewer:
        - select

An open question is whether some list of roles is necessary, aside from the object-level information.

I’d appreciate readers taking a couple of minutes to leave feedback on any of the above points, particularly on how important they think the features are in their day-to-day work.

Database User Interfaces – Pagination

Since it’s been a while from my last post on this subject, let me recap what we’ve covered:

I also took a detour to explore Python web frameworks. One of the first comments inquired about Tornado. I recently had the opportunity to experiment with it. There’s much to admire, particularly in terms of speed. However, the use of one-or-two-method handler classes essentially for each action (see the blog demo, for example) struck me as counterproductive. It reminded me of Jack Diederich’s “Stop Writing Classes” presentation.

Getting back to database UIs, I have added basic pagination to the listing of films. Here is an example of what it looks like:

The number of lines per page is currently hard-coded (see maxlines in FilmHandler.index), but eventually it should be configurable. The principal additions are the count() and slice() methods in the bl/film.py module. The first does a SELECT COUNT(*) FROM films, so that we can determine how many pages will be needed (the upcoming PostgreSQL 9.2 should improve performance of that query). The slice() method does a SELECT like the all() method but uses LIMIT and OFFSET to retrieve the subset of rows needed for the requested page number.

There are further refinements possible. For example, the list of page numbers/links at the bottom could get very long given enough data in the table. For now, correcting this is left as an exercise for the reader.


1 I’m glad to hear that Kenneth Reitz, Armin Ronacher and others are working on “merging” Werkzeug with Requests.