Tuesday, April 1, 2014

PostgreSQL trash can

The PostgreSQL trash can is a PostgreSQL plugin that implements a trash can/wastebasket/rubbish bin/recycling container. You drop a table and it's not really gone but only moved to the trash. This allows desktop-minded users to drop tables willy-nilly while giving them the warm and fuzzy feeling of knowing that their data is still there (while giving administrators the cold and, uh, unfuzzy feeling of knowing that disk space will never really be freed again). Now they only need to think of "vacuum" as "disk defragmentation", and they'll feel right at home.

Thursday, September 26, 2013

Design by committee

Design by committee is usually a term of abuse, but sometimes it's perhaps not the worst alternative. At the opposite end of the spectrum, there is design by disconnected individuals. That is how you get
ALTER TABLE tbl OWNER TO something
but
ALTER TABLE tbl SET SCHEMA something
in PostgreSQL.

Maybe a committee faced with this inconsistency would arrive at the compromise

ALTER TABLE tbl [SET] {OWNER|SCHEMA} [TO] something
?

Thursday, August 29, 2013

Testing PostgreSQL extensions on Travis CI revisited

My previous attempt to setup up multiple-PostgreSQL-version testing on Travis CI worked OK, but didn't actually make good use of the features of Travis CI. So I stole, er, adapted an idea from clkao/plv8js, which uses an environment variable matrix to control which version to use. This makes things much easier to manage and actually fires off parallel builds, so it's also faster. I've added this to all my repositories for PostgreSQL extensions now. (See some examples: pglibuuid, plxslt, pgvihash, pgpcre, plsh)

Wednesday, August 28, 2013

Automating patch review

I think there are two kinds of software development organizations (commercial or open source):

  1. Those who don’t do code review.

  2. Those who are struggling to keep up with code review.

PostgreSQL is firmly in the second category. We never finish commit fests on time, and lack of reviewer resources is frequently mentioned as one of the main reasons.

One way to address this problem is to recruit more reviewer resources. That has been tried; it’s difficult. The other way is to reduce the required reviewer resources. We can do this by automating things a little bit.

So I came up with a bag of tools that does the following:

  1. Extract the patches from the commit fest into Git.

  2. Run those patches through an automated test suite.

The first part is done by my script commitfest_branches. It extracts the email message ID for the latest patch version of each commit fest submission (either from the database or the RSS feed). From the message ID, it downloads the raw email message and extracts the actual patch file. Then that patch is applied to the Git repository in a separate branch. This might fail, in which case I report that back. At the end, I have a Git repository with one branch per commit fest patch submission. A copy of that Git repository is made available here: https://github.com/petere/postgresql-commitfest.

The second part is done by my Jenkins instance, which I have written about before. It runs the same job as it runs with the normal Git master branch, but over all the branches created for the commit fest. At the end, you get a build report for each commit fest submission. See the results here: http://pgci.eisentraut.org/jenkins/view/PostgreSQL/job/postgresql_commitfest_world/. You’ll see that a number of patches had issues. Most were compiler warnings, a few had documentation build issues, a couple had genuine build failures. Several (older) patches failed to apply. Those don’t show up in Jenkins at all.

This is not tied to Jenkins, however. You can run any other build automation against that Git repository, too, of course.

There is still some manual steps required. In particular, commitfest_branches needs to be run and the build reports need to be reported back manually. Fiddling with all those branches is error-prone. But overall, this is much less work than manually downloading and building all the patch submissions.

My goal is that by the time a reviewer gets to a patch, it is ensured that the patch applies, builds, and passes the tests. Then the reviewer can concentrate on validating the purpose of the patch and checking the architectural decisions.

What needs to happen next:

  • I’d like an easier way to post feedback. Given a message ID for the original patch submission, I need to fire off a reply email that properly attaches to the original thread. I don’t have an easy way to do that.

  • Those reply emails would then need to be registered in the commit fest application. Too much work.

  • There is another component to this work flow that I have not finalized: checking regularly whether the patches still apply against the master branch.

  • More automated tests need to be added. This is well understood and a much bigger problem.

In the meantime, I hope this is going to be useful. Let me know if you have suggestions, or send me pull requests on GitHub.

Thursday, July 18, 2013

Testing PostgreSQL extensions on Travis CI

I have cobbled together some scripts to be able to test PostgreSQL extensions against multiple PostgreSQL major versions on Travis CI. (This requires that the extension is hosted on GitHub.) See the configuration for PL/sh and the build output as examples. Perhaps others will find this useful for their extensions as well.

Wednesday, July 17, 2013

Tricky shell local variables

I have a word of warning against improper use of local in shell functions.

If you are using shell functions, you might want to declare some variables local to the shell function. That is good. The basic syntax for that is

local a b c

In some shells, you can also combine the local declaration and assignment, like this:

local foo=$1
local bar=$2

(The Debian policy even explicitly allows it.)

This is somewhat dangerous.

Bare shell assignment like

foo=$bar

does not perform word splitting, so the above is safe even if there are spaces in $bar. But the local command does perform word splitting (because it can take multiple arguments, as in the first example), so the seemingly similar

local foo=$bar

is not safe.

This can be really confusing when you add local to existing code and it starts breaking.

You can avoid this, of course, by always quoting everything to like

local foo="$bar"

but overquoting isn't always desirable, because it can make code less readable when commands are nested, like

local foo="$(otherfunc "other arg")"

(Nesting is legal and works fine in this case, however.)

I suggest using local only for declaring variables, and using separate assignment statements. That way, all assignments are parsed in the same way.

Wednesday, June 19, 2013

Autopex: PostgreSQL extension installation magic

Autopex is the brainchild of a long night at the Royal Oak. It ties together Pex and event triggers to magically download and build any extension that you install. So after you have set everything up you can do, say, CREATE EXTENSION plproxy, and it will transparently download and build plproxy for you. (Actually, this only works if the extension name is the same as the package name. I'm planning to fix that.)

Note 1: You can't install Autopex via Pex, yet.

Note 2: I guess the next logical step would be Autoautopex, which installs Autopex and Pex automatically somehow. Patches welcome.

I suppose with logical replication, this might actually end up installing the extension code on the replication slaves as well. That would be pretty neat.