Tuesday, March 6, 2012

PostgreSQL make install times

I have decided that make install is too slow for me. Compare: A run of make install takes about 10 seconds (details below), but a run of make all with the tree mostly up to date and using ccache for the rest usually takes about 1 or 2 seconds. You can end up wasting a lot of time if you need to do many of these build and install cycles during development. In particular, make check includes a run of make install, so all this time is added to the time it takes for tests to complete.

So let's optimize this. The times below are all medians from 5 consecutive runs, writing over an existing installation, so they all had to do the same amount of work.
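A minimal sketch of how medians like these could be collected. The helper names `median` and `time_runs` are hypothetical and not part of the measurements in this post, and the timing relies on `date +%s.%N`, which assumes GNU date:

```shell
#!/bin/sh
# Hypothetical helpers for illustration only.

# median: read one number per line on stdin and print the median.
median() {
    LC_ALL=C sort -n | awk '
        { v[NR] = $1 }
        END {
            if (NR == 0) { exit 1 }
            if (NR % 2) {
                print v[(NR + 1) / 2]
            } else {
                print (v[NR / 2] + v[NR / 2 + 1]) / 2
            }
        }'
}

# time_runs N CMD...: run CMD N times, discarding its output, and
# print the wall-clock seconds of each run, one per line.
# Assumes GNU date for the %N (nanoseconds) format.
time_runs() {
    n=$1
    shift
    i=0
    while [ "$i" -lt "$n" ]; do
        start=$(date +%s.%N)
        "$@" >/dev/null 2>&1
        end=$(date +%s.%N)
        echo "$start $end" | awk '{ printf "%.3f\n", $2 - $1 }'
        i=$((i + 1))
    done
}
```

With these defined, something like `time_runs 5 make install SHELL=/bin/dash | median` would print a single median figure for that variant.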

This is the baseline:

  • make install — 10.493 s

The first change is to use a faster shell. This system is using bash as /bin/sh. Many Linux distributions now use dash instead, but for some reason I didn't switch this system over during the upgrade.

  • make install SHELL=/bin/dash — 6.344 s
I guess I'll be switching this system soon as well then!
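If you are not sure which shell your own /bin/sh is, resolving the symlink is a quick way to find out. This is a small sketch; the function name `sh_target` is made up for illustration, and `readlink -f` assumes GNU coreutils or a compatible implementation:

```shell
#!/bin/sh
# sh_target: print the real file that /bin/sh resolves to.
# Assumes a readlink that supports -f (GNU coreutils or compatible).
sh_target() {
    readlink -f /bin/sh
}

sh_target
```

On a system where /bin/sh is a symlink, the output is the path of the shell behind it, typically one ending in dash or bash.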

The next thing is to avoid installing the translation files. NLS support multiplies the number of files that need to be installed: instead of, say, one program file, you end up installing one program file and a dozen or so translation files.

  • make install SHELL=/bin/bash enable_nls=no — 6.890 s
  • make install SHELL=/bin/dash enable_nls=no — 4.482 s
(In practice you would use configure --disable-nls, which is the default. The above is just a way to do this without reconfiguring.) Now I have in the past preferred to build with NLS support to be able to catch errors in that area, but considering this improvement and the availability of the make maintainer-check target, I might end up building without it more often.

Another tip I remembered from the past was to use the make -s option to avoid screen output. Depending on the operating system and whether you are logged in locally or remotely, this can be a big win. On my system, this got lost in the noise a bit, but it appeared to make a small difference over many runs.

  • make install SHELL=/bin/bash -s — 10.511 s
  • make install SHELL=/bin/dash -s — 6.146 s
Do add this to your arsenal anyway if you want to get maximum performance.

Next, let's replace the install-sh script that does the actual file copying. For obscure reasons, PostgreSQL always uses that shell script, instead of the /usr/bin/install program that an Autoconf-based build system would normally use. But you can override the make variable and substitute the program you want:

  • make install SHELL=/bin/bash INSTALL=install — 5.418 s
  • make install SHELL=/bin/dash INSTALL=install — 3.995 s
Interestingly, the choice of shell still makes a noticeable difference, even though it's no longer used to execute install-sh.

Finally, you can also use parallel make for the installation step:

  • make install SHELL=/bin/bash -j2 — 6.538 s
  • make install SHELL=/bin/dash -j2 — 4.158 s
You can gather from these numbers that the installation process appears to be mostly CPU-bound. This system has 4 cores, so let's add some more parallelization:
  • make install SHELL=/bin/dash -j3 — 3.330 s
  • make install SHELL=/bin/dash -j4 — 2.944 s
  • make install SHELL=/bin/dash -j5 — 2.930 s
  • make install SHELL=/bin/dash -j6 — 2.952 s
That's probably enough.
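Since the times level off around the core count, one option is to derive the -j value from the machine instead of hard-coding it. A rough sketch, where `cpu_count` is a hypothetical helper; `nproc` assumes GNU coreutils, and the `getconf _NPROCESSORS_ONLN` fallback is widely available but not guaranteed everywhere:

```shell
#!/bin/sh
# cpu_count: print the number of online CPUs.
# Uses nproc when available, otherwise falls back to getconf.
cpu_count() {
    if command -v nproc >/dev/null 2>&1; then
        nproc
    else
        getconf _NPROCESSORS_ONLN
    fi
}
```

This could then be used as `make install SHELL=/bin/dash -j"$(cpu_count)"`. Whether the core count is the best setting on a given machine is something to measure, as above.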

Now let's put everything together:

  • make install SHELL=/bin/dash enable_nls=no INSTALL=install -s -j4 — 1.708 s
Or even:
  • make install SHELL=/bin/dash enable_nls=no INSTALL=install -s -j3 — 1.654 s
That's a very nice improvement from 10.493 s!
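To avoid retyping all of that, the combined invocation could be wrapped in a small shell function. This is only a sketch: the name `fast_install` and the `MAKE` override are my own conventions for illustration, and the fallback to /bin/sh when dash is missing is an assumption about what you would want:

```shell
#!/bin/sh
# fast_install: run "make install" with the combined options from above.
# Extra arguments are passed through to make. Set MAKE to use a
# different make program (for example, MAKE=echo to preview the arguments).
fast_install() {
    # Prefer dash when present; otherwise keep the system /bin/sh.
    if [ -x /bin/dash ]; then
        fast_shell=/bin/dash
    else
        fast_shell=/bin/sh
    fi
    "${MAKE:-make}" install SHELL="$fast_shell" enable_nls=no \
        INSTALL=install -s -j4 "$@"
}
```

Running `MAKE=echo fast_install` prints the arguments that would be passed without installing anything, which is handy for checking the option list.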

The problem is, it is not all that easy to pass these options to the make install calls made in make check runs. If you can and want to change your system shell, and you configure without NLS support, then you will probably already be more than halfway there. Then again, I suspect most readers already have that setup anyway. For the other options, to bring the installation time down to nearly instantaneous, you have to do ad hoc surgery in various places. I'm looking into improving that.

5 comments:

  1. very nice, tyvm

  2. That's a very cool post. Since I do install pg quite often, I will definitely use your ideas. Thanks.

  3. > For obscure reasons, PostgreSQL always uses that shell script, instead of the /usr/bin/install program that an Autoconf-based build system would normally use.

    Unfortunately your commit in 2001 (a3176dac22c) gives zero details as to this "obscure reason", I was hoping your blog post would do better than that. Seems like a hack that has clearly outlived its usefulness.

    Replies
    1. The reasons are obscure, after all, and not directly relevant to this post. But I plan to recover them.

  4. Good analysis. My "make -j2 install" on Debian 6 is 1.8 seconds, so I was confused why it was so fast. Turns out /bin/sh is already dash as a default on this operating system. I now remember having to switch a few scripts to use /bin/bash on this system.
