Sunday, February 17, 2013

Lists and Tuples

I initially found the divide between lists and tuples in Python confusing. I come from a database background, so I have a certain expectation of what a tuple might be. If you read up on the difference in Python, you will find that a) tuples are immutable, and b) singleton tuples use a funny syntax. So just use lists, because they are easier to read and you can't go wrong that way. Oh, and they are both sequences, another overloaded term.

(Yes, there are some details omitted here, such as that since a tuple is immutable, it is hashable and can be used as a dictionary key. But I think that is used fairly seldom.)
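The points above can be sketched in a few lines of Python. This is only a minimal illustration; the variable names and the distances mapping are made up for the example.

```python
# A one-element tuple needs a trailing comma; parentheses alone only group.
single = (1,)
not_a_tuple = (1)

# The commas make the tuple, not the parentheses.
bare = 1, 2, 3

# Tuples are immutable: item assignment raises TypeError.
point = (3, 4)
try:
    point[0] = 5
    mutated = True
except TypeError:
    mutated = False

# A tuple of hashable items can serve as a dictionary key.
# "distances" is a hypothetical mapping, used only for illustration.
distances = {(0, 0): 0.0, (3, 4): 5.0}
lookup = distances[point]
```

A list in the same key position would be rejected, because lists are not hashable.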

Then I came across Haskell and it dawned on me: Was this just a poorly mangled feature from Haskell? I don't know the history, but it looks a bit like it. You see, Haskell also has lists and tuples. Lists are delimited by square brackets, and tuples are delimited by parentheses:

let alist = [1, 2, 3]
let atuple = (1, 2, 3)
(Technically, in Python, tuples are not delimited by parentheses, but they often appear that way.)

But the difference is that Haskell does not use parentheses to delimit function arguments. It uses spaces for that. (So it looks more like a shell script at times.)
Python: len([1, 2, 3])
Haskell: length [1, 2, 3]
But in Haskell, tuples are not immutable lists and lists are not mutable tuples. Tuples and lists are quite different but complementary things. A list can only contain elements of the same type. So you can have lists
[1, 2, 3, 4, 5]
["a", "b", "c", "d"]
but not
[1, 2, "a"]
A tuple, on the other hand, can contain values of different types
(1, 2, "a")
(3, 4, "b")
A particular type combination in a tuple creates a new type on the fly, which becomes interesting when you embed tuples in a list. So you can have a list
[(1, 2, "a"), (3, 4, "b")]
but not
[(1, 2, "a"), (3, 4, 5)]
Because Haskell is statically typed, it can verify this at compile time.

If you think in terms of relational databases, the term tuple in particular makes a lot of sense in this way. A result set from a database query would be a list of tuples.

The arrival of the namedtuple also supports the notion that tuples should be thought of as combining several pieces of data of different natures, but of course this is not enforced in either tuples or named tuples.
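As a rough sketch of that idea, a named tuple can model one row of a query result. The Employee type and its fields are hypothetical, chosen only for illustration.

```python
from collections import namedtuple

# "Employee" and its fields are hypothetical, chosen for illustration.
Employee = namedtuple("Employee", ["id", "name", "salary"])

row = Employee(1, "Alice", 5200.0)

# Fields are reachable by name or by position.
by_name = row.name
by_index = row[1]

# Nothing checks the field types: this is accepted without complaint.
odd = Employee("one", 2, None)
```

The last line shows the lack of enforcement: the same "type" happily holds a string where an integer was intended.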

Now, none of this is relevant to Python. Because of duck typing, a database query result set might as well be a list of lists or a tuple of tuples or something different altogether that emulates sequences. But I found it useful to understand where this syntax and terminology might have come from.
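To illustrate the duck typing point, here is a small sketch in which the same function handles several result set shapes. The function total_salary and the sample rows are invented for the example and do not come from any database API.

```python
# "total_salary" and the sample rows are hypothetical, for illustration only.
def total_salary(rows):
    # Relies only on iteration and indexing, not on the concrete type.
    return sum(row[2] for row in rows)

as_list_of_tuples = [(1, "Alice", 5200), (2, "Bob", 4800)]
as_list_of_lists = [[1, "Alice", 5200], [2, "Bob", 4800]]
as_tuple_of_tuples = tuple(as_list_of_tuples)

totals = [
    total_salary(as_list_of_tuples),
    total_salary(as_list_of_lists),
    total_salary(as_tuple_of_tuples),
]
```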

Looking at the newer classes set and frozenset, it might also help to sometimes think of a tuple as a frozenlist instead, because this is closer to the role it plays in Python.
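The parallel can be sketched as follows. The helper is_hashable is a made-up name for this example, not a standard library function.

```python
# "is_hashable" is a hypothetical helper written for this illustration.
def is_hashable(obj):
    try:
        hash(obj)
    except TypeError:
        return False
    return True

members = {"a", "b"}
frozen_members = frozenset(members)  # set -> frozenset

items = ["a", "b"]
frozen_items = tuple(items)          # list -> tuple, the "frozenlist"

# The frozen variants lose their mutating methods and gain hashability.
frozen_has_add = hasattr(frozen_members, "add")
frozen_has_append = hasattr(frozen_items, "append")
```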

Thursday, February 14, 2013

pgindent Jenkins job

I have set up a Jenkins job that runs pgindent. Besides checking that the procedure of running pgindent works, it also provides a preview of what pgindent would do with the current source (pgindent.diff), which can be educational or terribly confusing.

Friday, February 1, 2013

Introducing the Pex package manager for PostgreSQL

I have written a new lightweight package manager for PostgreSQL, called pex. It's targeted at developers, allows easy customization, and supports multiple PostgreSQL installations.

Here is how it works:

Installation:

git clone git://github.com/petere/pex.git
cd pex
sudo make install

Install some packages:

pex init
pex install plproxy
pex search hash
pex install pghashlib

Multiple PostgreSQL installations:

pex -g /usr/local/pgsql2 install plproxy
pex -p 5433 install pghashlib

Upgrade:

pex update
pex upgrade

It works a bit like Homebrew, except that it doesn't use Ruby or a lot of metaphors. ;-)

Check it out at https://github.com/petere/pex.