Thursday, November 4, 2010

pipefail

It is widely considered good style to include
set -e
near the beginning of a shell script so that it aborts when there is an uncaught error. The Debian policy also recommends this.

Unfortunately, this doesn't catch failures inside pipelines. So if you have something like
some_command | sed '...'
a failure of some_command won't be recognized.

By default, the return status of a pipeline is the return status of its last command. In the example above that is the sed command, which is usually not the failure candidate you're worried about. And set -e is defined to exit immediately if the return status of a pipeline is nonzero, so the exit status of some_command is never considered.
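
To see the problem in isolation, here is a minimal sketch. It assumes false as a stand-in for a failing some_command and cat as a stand-in for the sed filter; the variable names are made up for illustration.

```shell
#!/bin/bash
# `false` is a stand-in for a failing some_command, `cat` for the sed filter.

# The pipeline's status is that of its last command, so this records 0.
false | cat
pipeline_status=$?

# Under set -e, the child script therefore carries on past the failed pipeline.
child_output=$(bash -c 'set -e; false | cat; echo "still running"')
child_status=$?

echo "pipeline status: $pipeline_status"
echo "child said: $child_output (exit status $child_status)"
```

The child script prints its message and exits successfully even though the first command in the pipeline failed.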

Fortunately, there is a straightforward solution, which might not be very well known. Use
set -o pipefail
With pipefail, the return status of a pipeline is "the value of the last (rightmost) command to exit with a non-zero status, or zero if all commands exit successfully". So if some_command fails, the whole pipeline fails, and set -e kicks in. Note that you need set -o pipefail and set -e together to get this effect: pipefail on its own only changes the pipeline's return status, it doesn't abort the script.
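
A small sketch of the combined effect, again using false and cat as stand-ins and hypothetical variable names. The child scripts are run through bash -c so that the demonstration itself keeps running after they abort.

```shell
#!/bin/bash
# With both options set, the failed pipeline aborts the child script,
# so the echo after it is never reached.
status=0
output=$(bash -c 'set -e -o pipefail; false | cat; echo "not reached"') || status=$?
echo "exit status with pipefail and set -e: $status"

# The reported status is that of the rightmost failing command.
rightmost=$(bash -c 'set -o pipefail; (exit 3) | (exit 5) | true; echo $?')
echo "status of the three-stage pipeline: $rightmost"
```

The second part only sets pipefail, which is why the child script survives long enough to print the pipeline's status.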

This works in bash (ksh93 has the same option), but it is not something every shell provides, so if you're trying to write scripts that conform to POSIX or some other standard, you can't rely on it. (There are usually other ways to discover failures in pipelines in other shells, but none appear to be as simple as this one.) But if you are writing bash anyway, you should definitely use it. And if you're not using bash but use a lot of pipelines in your scripts, you should perhaps consider using bash.
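
For comparison, here is one possible workaround for shells without pipefail. It is only a sketch, not a recommendation: it assumes mktemp is available (it is common but not guaranteed by POSIX), uses false and cat as stand-ins, and the names statusfile and first_status are made up for illustration.

```shell
#!/bin/sh
# Record the first command's exit status in a temporary file, because the
# pipeline itself only reports the status of its last command.
statusfile=$(mktemp)

# `false || rc=$?` captures the status without tripping set -e, if it is on.
{ rc=0; false || rc=$?; echo "$rc" > "$statusfile"; } | cat

first_status=$(cat "$statusfile")
rm -f "$statusfile"

if [ "$first_status" -ne 0 ]; then
    echo "first command in the pipeline failed with status $first_status" >&2
fi
```

Compared with a single set -o pipefail line, this needs extra bookkeeping for every pipeline you care about.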

(Hmm, it looks like there could be a number of latent bugs in the existing Debian package maintainer scripts, because this issue appears to be widely ignored.)

Monday, November 1, 2010

Git User's Survey 2010 Results

The results of the Git User's Survey 2010 are up.

Not many surprises, but I can see how this sort of survey is very useful for the developers of Git.