Friday, May 28, 2010

System-Versioned Tables

After my report on the upcoming SQL:2011, some people had asked me about the system-versioned table feature that is going to be the arguably only major new feature there. Here is how it works:
CREATE TABLE tab (
    useful_data int,
    more_data varchar,
    start timestamp GENERATED ALWAYS AS SYSTEM VERSION START,
    end timestamp GENERATED ALWAYS AS SYSTEM VERSION END
) WITH SYSTEM VERSIONING;
(This hilariously verbose syntax arises because this is defined so that it fits into the more general generated columns feature, e. g., GENERATED ALWAYS AS IDENTITY, similar to PostgreSQL's serial type.)
INSERT INTO tab (useful_data, more_data) VALUES (...);
This sets the "start" column to the current transaction timestamp, and the "end" column to the highest possible timestamp value.
UPDATE tab SET useful_data = something WHERE more_data = whatever;
For each row that would normally be updated, set the "end" timestamp to the current transaction timestamp, and insert a new row with the "start" timestamp set to the current transaction timestamp. DELETE works analogously.
SELECT * FROM tab;
This only shows rows where current_timestamp is between "start" and "end". To show the non-current data, the following options are
possible:
SELECT * FROM tab AS OF SYSTEM TIME timestamp;
SELECT * FROM tab VERSIONS BEFORE SYSTEM TIME timestamp;
SELECT * FROM tab VERSIONS AFTER SYSTEM TIME timestamp;
SELECT * FROM tab VERSIONS BETWEEN SYSTEM TIME timestamp AND timestamp;
There's also the option of
CREATE TABLE tab ( ... ) WITH SYSTEM VERSIONING KEEP VERSIONS FOR interval;
to automatically delete old versions.

That's more or less it. It's pretty much xmin/xmax/vacuum on a higher level with timestamps instead of numbers. And it's a revival of the old time travel feature. Obviously, you can do most or all of this with triggers already.

Wednesday, May 19, 2010

PostgreSQL Package Management

Database development sucks, and I would like to fix it. I don't mean database system software development; that's awesome. ;-) I mean database development as in writing the tables, views, functions, and other code that make up your database. We have come further in recent years in PostgreSQL land. We have had the PL/pgSQL debugger, there is Piggly for PL/pgSQL code coverage, we have pgTAP for unit testing support, we have in Post Facto a version control system running inside the database, although I'll try to explain below why I think that that is not the right solution. My problem is that getting database code from the editor to the database server in a safe manner is pretty difficult. This already starts with deploying simple database code to a single server for the first time (as shown in this entry), and gradually gets more complicated when you need to update existing installations, manage multiple servers, or even multiple versions of that code.

My answer to that problem is an old friend: package management. Package managers such as dpkg and rpm are pretty well-established solutions and have shown over the years that managing software deployments can be easy and occasionally even fun.

Consider how the development and deployment of normal software proceeds:
  • You edit some files in your text editor.
  • You execute some kind of build step (could be compilation or more trivial).
  • Maybe you run a test suite.
  • You check your files into version control.
And then you or someone else, perhaps a release manager or system administratior does something like this:
  • Tag a release.
  • Export the tag from version control.
  • Build the software.
  • Run tests.
  • Build a package.
  • Push package to repository.
  • Install/upgrade package on deployment machines.
There are of course sites that do this differently. One could, for example, just copy the built files onto the target machines. But if you do that, then you might as well be happy with the current state of SQL deployment. Many sites are going for continuous deployment nowadays, in which case you might not need a way to track what version of the code is currently running, but you will still want a way to make deployments and upgrades in a safe manner. My premise is that the development cycle of database software should work in pretty much the same way as for other software. But we are only slowly getting the tools to support this work flow. We have editors that allow editing SQL, of course. We don't really need any new build tools; the existing ones appear to serve just fine. We have testing support, getting better all the time. At this point it was once thought that perhaps a database-based version control system such as Post Facto would address the remaining issues. After considering that for a while I have figured out that it doesn't. The main problems with that approach are:
  • Abuses version control system as a deployment tool. Some people deploy code by svn checkout, but most don't.
  • Does not allow for a build step (in a straightforward manner). The database is the source code. You can't generate your database from some XML source, for example.
  • I consider it preferable to put database code and related non-database code in the same version control system, so they can be developed, browsed, deployed, etc. together.
A database-based version control system might actually be the thing for some people, just like other version control systems serve different needs. (Except, note that mine is better than yours! ;-) )

So I think what the world needs is a package manager for databases, that manages database objects just like a package manager for operating systems manages files. And it should work in much the same ways, like this:

  • Conceptually, there are two major modes of operation: building a package and installing a package (think rpmbuild and rpm, or dpkg-buildpackage and dpkg).
  • Building a package is controlled by a special instruction file (think RPM spec file or debian/ directory).
  • Building a package involves running that software's normal build routine and installing it into a temporary staging area. In simple cases, this is running psql -f somefile.sql, but it could be anything. You could have an ORM tool generate the database.
  • The package file is an archive file containing the contents of that staging area plus some metadata.
  • Installing a package unpacks the archive file and places the contents into the designated destination, while sensibly dealing with existing installations.
There are additional features of well-known package managers that could be of interest in a database environment: pre/post scripts for additional setup, relocations (perhaps for selecting installation schemas), integration with a repository manager (like yum or apt).

So I have set out to write such a tool and have arrived at a small prototype. Here is how it works:

$ pgpkg build
 pgpgk/rules clean
make: Nothing to be done for `clean'.
 pgpgk/rules build
make: Nothing to be done for `build'.
 pgpgk/rules binary
psql -X -q -f install_foo.sql
pgpkg create
wrote ../foo_1.0.pgpkg
This build process is currently driven by a Debian-style rules and control file. An RPM-style spec file might also be possible.
$ pgpkg contents ../foo_1.0.pgpkg
{'Client-Encoding': 'UTF8', 'Version': 1.0, 'Format': '0.0', 'Package': 'foo'}
CREATE FUNCTION "public"."test1"(integer) RETURNS integer LANGUAGE "sql" VOLATILE AS $$select 1$$;
CREATE FUNCTION "public"."test2"(integer) RETURNS integer LANGUAGE "sql" VOLATILE AS $$select 2$$;
CREATE FUNCTION "public"."test3"(integer, integer) RETURNS integer LANGUAGE "sql" VOLATILE AS $$select 3$$;

$ pgpkg install -d 'dbname=bar' ../foo_1.0.pgpkg

$ pgpkg remove -d 'dbname=bar' foo
There are more features planned, such as diffing packages, but I haven't gotten to those yet. Also, don't get too excited yet, it only handles a very small subset of SQL objects. Much more code needs to be added to handle everything, but that will mostly be a typing exercise (famous last words).

Ah, but what about this extension packaging business that has been discussed for the last couple of years? Well, I have been staring at the various design documents for that same couple of years now, and it's not going anywhere and it's not really set out to solve my problem. Partially by coincidence, partially by intent, as I'm posting this the guys are sitting in Ottawa discussing extension packaging yet again, so at least take this as a challenge to get something going. :-)

Note that I am calling this thing a "package", because that's what it is. It's not (necessarily) an "extension" (although an extension could be packaged as a package), and it's not a "module" (that term is reserved by SQL; what Oracle calls a package is actually a module).

Also note that this is implemented as an external program, not in the database server. I think the operating system shell is a much more usable and versatile environment. Think of all the things you might want to do: build many packages, compare packages, install some/many/all packages, grep package contents, copy, download, or upload packages. This makes most sense from the shell. If you think about it, neither dpkg nor rpm require extensions in the kernel or the file system to do their work; the same applies here. Another advantage is that the implementation can evolve faster and doesn't need to wait for say PostgreSQL 9.1 to be out.

Anyway, for the moment, forget about extensions. Think about the database application code that you need to move around. Does the above approach make sense? Could it be useful? Ideas, suggestions?

Tuesday, May 18, 2010

Visual Explain Reloaded

One of the new features in PostgreSQL 9.0 is EXPLAIN output in different markup formats (XML, JSON, YAML). The idea is that this makes it easier for programs (as opposed to humans) to parse the output.

Using one of the standard teaching examples, compare the outputs:
regression=> EXPLAIN SELECT * FROM tenk1 t1, tenk2 t2
WHERE t1.unique1 < 100 AND t1.unique2 = t2.unique2;
                                        QUERY PLAN
------------------------------------------------------------------------------------------
 Hash Join  (cost=227.95..700.46 rows=101 width=488)
   Hash Cond: (t2.unique2 = t1.unique2)
   ->  Seq Scan on tenk2 t2  (cost=0.00..434.00 rows=10000 width=244)
   ->  Hash  (cost=226.68..226.68 rows=101 width=244)
         ->  Bitmap Heap Scan on tenk1 t1  (cost=5.03..226.68 rows=101 width=244)
               Recheck Cond: (unique1 < 100)
               ->  Bitmap Index Scan on tenk1_unique1  (cost=0.00..5.01 rows=101 width=0)
                     Index Cond: (unique1 < 100)
(8 rows)
regression=> EXPLAIN (FORMAT JSON) SELECT * FROM tenk1 t1, tenk2 t2
WHERE t1.unique1 < 100 AND t1.unique2 = t2.unique2;

                     QUERY PLAN
-----------------------------------------------------
 [                                                  +
   {                                                +
     "Plan": {                                      +
       "Node Type": "Hash Join",                    +
       "Join Type": "Inner",                        +
       "Startup Cost": 227.95,                      +
       "Total Cost": 700.46,                        +
       "Plan Rows": 101,                            +
       "Plan Width": 488,                           +
       "Hash Cond": "(t2.unique2 = t1.unique2)",    +
       "Plans": [                                   +
         {                                          +
           "Node Type": "Seq Scan",                 +
           "Parent Relationship": "Outer",          +
           "Relation Name": "tenk2",                +
           "Alias": "t2",                           +
           "Startup Cost": 0.00,                    +
           "Total Cost": 434.00,                    +
           "Plan Rows": 10000,                      +
           "Plan Width": 244                        +
         },                                         +
         {                                          +
           "Node Type": "Hash",                     +
           "Parent Relationship": "Inner",          +
           "Startup Cost": 226.68,                  +
           "Total Cost": 226.68,                    +
           "Plan Rows": 101,                        +
           "Plan Width": 244,                       +
           "Plans": [                               +
             {                                      +
               "Node Type": "Bitmap Heap Scan",     +
               "Parent Relationship": "Outer",      +
               "Relation Name": "tenk1",            +
               "Alias": "t1",                       +
               "Startup Cost": 5.03,                +
               "Total Cost": 226.68,                +
               "Plan Rows": 101,                    +
               "Plan Width": 244,                   +
               "Recheck Cond": "(unique1 < 100)",   +
               "Plans": [                           +
                 {                                  +
                   "Node Type": "Bitmap Index Scan",+
                   "Parent Relationship": "Outer",  +
                   "Index Name": "tenk1_unique1",   +
                   "Startup Cost": 0.00,            +
                   "Total Cost": 5.01,              +
                   "Plan Rows": 101,                +
                   "Plan Width": 0,                 +
                   "Index Cond": "(unique1 < 100)"  +
                 }                                  +
               ]                                    +
             }                                      +
           ]                                        +
         }                                          +
       ]                                            +
     }                                              +
   }                                                +
 ]
(1 row)
For just reading the plan, the new formats are probably not better, although some might prefer them. The real win comes when you can feed this to a program to create a visualization. With the old format, parsing the output was complicated and error prone. With the new formats, it is easy. In fact, it was so easy that I couldn't resist writing a small visual explain program that renders plans through the graphviz library. Here is how you can use it:
regression=> \a\t
regression=> EXPLAIN (FORMAT JSON) SELECT * FROM tenk1 t1, tenk2 t2
WHERE t1.unique1 < 100 AND t1.unique2 = t2.unique2 \g |veung
Result:

Simple and useful. Get it from GitHub.

Wednesday, May 12, 2010

MERGE Syntax

The SQL MERGE statement has gotten my attention again. For many years, PostgreSQL users have been longing for a way to do an "upsert" operation, meaning do an UPDATE, and if no record was found do an INSERT (or the other way around). Especially MySQL users are familiar with the REPLACE statement and the INSERT ... ON DUPLICATE KEY UPDATE statement, which are two variant ways to attempt to solve that problem (that have interesting issues of their own). Of course, you can achieve this in PostgreSQL with some programming, but the solutions tend to be specific to the situation, and they tend to be lengthier than one would want.

Discussions on this then usually proceed to speculate that the SQL-standard MERGE statement ought to be the proper way to solve this, and then it turns out that no one completely understands the MERGE syntax or semantics, especially as they apply to this upsert problem. (I was in that group.) And that's been the end of that so far. OK, as I write this I am pointed, via Robert Haas's blog post, to an older mailing list post by Simon Riggs, who is surely one of the men most qualified to drive an eventual implementation, that contains a hint toward the solution, but it's hard to find in that post, if you want to try.

This subject had gotten my attention again at the SQL standard working group meeting I attended a few weeks ago, where I learned that in SQL:2011, a DELETE branch has been added to MERGE. We also took some time after the official part of the meeting to work through some examples that illustrate the uses of the MERGE statement.

Let's take a look at what the MERGE statement is originally supposed to do, and where the term "merge" arose from. Let's say you have a table with outstanding balances, such as
CREATE TABLE balances (
    name text,
    balance numeric
);
and at intervals you get a list of payments that your organization has received, such as
CREATE TABLE payments (
    name text,
    payment numeric
);
What you want to do then is to "merge" the payments table into the balances table in the following way:
  • If a balance exists, subtract from it.
  • If the balance goes to zero, delete the entire record.
  • If no balance exists, create a record (maybe someone pre-paid).
The command to do this would be:
MERGE INTO balances AS b
    USING payments AS p
    ON p.name = b.name
    WHEN MATCHED AND b.balance - p.payment = 0 THEN DELETE
    WHEN MATCHED AND b.balance - p.payment <> 0 THEN UPDATE SET balance = balance - p.payment
    WHEN NOT MATCHED THEN INSERT (name, balance) VALUES (p.name, -b.payment);
Of course there are simpler cases, but this shows all of the interesting features of this command.

How does this get us upsert? There, you don't have two tables, but only one and some values. I have seen some claims and examples about this in the wild that turn out to be wrong because they evidently violate the syntax rules of the SQL standard. So I did the only sensible thing and implemented the MERGE syntax into the PostgreSQL parser on the flight back, because that seemed to be the best way to verify the syntax. So the correct way, I believe, to do, say, an upsert of the balances table would be:

MERGE INTO balances AS b
    USING (VALUES ('foo', 10.00), ('bar', 20.00)) AS p (name, payment)
    ON p.name = b.name
    WHEN MATCHED AND b.balance - p.payment = 0 THEN DELETE
    WHEN MATCHED AND b.balance - p.payment <> 0 THEN UPDATE SET balance = balance - p.payment
    WHEN NOT MATCHED THEN INSERT (name, balance) VALUES (p.name, -b.payment);
Not all that nice and compact, but that's how it works.

Note that the AS clause after VALUES is required. If you leave it off, the PostgreSQL parser complains that a subquery in FROM needs an AS clause. Which is obviously not what this is, but it uses the same grammar rules, and it makes sense in this case because you need a correlation name to join against. And it was also one of those rare moments when you implemented something that gives you correct feedback that you didn't even provide for.

Anyway, the examples above all parse correctly, but they don't do anything yet. But if someone wants to implement this further or just try out the syntax, I'll send my code.

Friday, May 7, 2010

Update: PostgreSQL doesn't really need a new Python driver

A couple of months ago, there was a brief stir in the PostgreSQL community about the state of the Python drivers and there being too many of them. While we can't really do much anymore about there being too many, the most popular driver, Psycopg, has since had a revival:
  • A new release 2.0.14 has been issued.
  • The license has been changed to GNU LGPL 3, alleviating various concerns about the previous "hacked up" license.
Of course, this doesn't yet fix every single bug that has been mentioned, but it ought to be the foundation for a viable future. The mailing list has been quite active lately, and the Git repository has seen a bunch of commits, by multiple developers. So if you have had issues with Psycopg, get involved now. And thanks to all who have made this "revival" happen.