Wednesday, August 28, 2013

Automating patch review

I think there are two kinds of software development organizations (commercial or open source):

  1. Those who don’t do code review.

  2. Those who are struggling to keep up with code review.

PostgreSQL is firmly in the second category. We never finish commit fests on time, and lack of reviewer resources is frequently mentioned as one of the main reasons.

One way to address this problem is to recruit more reviewer resources. That has been tried; it’s difficult. The other way is to reduce the required reviewer resources. We can do this by automating things a little bit.

So I came up with a bag of tools that does the following:

  1. Extract the patches from the commit fest into Git.

  2. Run those patches through an automated test suite.

The first part is done by my script commitfest_branches. It extracts the email message ID for the latest patch version of each commit fest submission (either from the database or the RSS feed). From the message ID, it downloads the raw email message and extracts the actual patch file. Then that patch is applied to the Git repository in a separate branch. This might fail, in which case I report that back. At the end, I have a Git repository with one branch per commit fest patch submission. A copy of that Git repository is made available here: https://github.com/petere/postgresql-commitfest.
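
To make the "extract the actual patch file" step concrete, here is a minimal Python sketch. It is not the code of commitfest_branches. The function name `extract_patches` and the suffix list are hypothetical, and it assumes the patch arrives as a named attachment ending in .patch or .diff. Inline or compressed patches would need extra handling.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage

# Assumption: patches are attached under one of these file name suffixes.
PATCH_SUFFIXES = (".patch", ".diff")


def extract_patches(raw_message):
    """Return (filename, content_bytes) pairs for patch-like attachments."""
    msg = message_from_bytes(raw_message, policy=policy.default)
    patches = []
    for part in msg.walk():
        filename = part.get_filename()
        if filename and filename.lower().endswith(PATCH_SUFFIXES):
            # decode=True undoes base64 or quoted-printable transfer encoding.
            patches.append((filename, part.get_payload(decode=True)))
    return patches


# A tiny stand-in for a downloaded raw message (illustrative data only).
sample = EmailMessage()
sample["Subject"] = "[PATCH] example"
sample.set_content("Patch attached.")
sample.add_attachment(
    b"--- a/f\n+++ b/f\n",
    maintype="application",
    subtype="octet-stream",
    filename="fix.patch",
)

for name, content in extract_patches(sample.as_bytes()):
    print(name, len(content), "bytes")
```

Each extracted file could then be applied on its own branch, with a failed apply recorded rather than treated as fatal.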

The second part is done by my Jenkins instance, which I have written about before. It runs the same job as it runs with the normal Git master branch, but over all the branches created for the commit fest. At the end, you get a build report for each commit fest submission. See the results here: http://pgci.eisentraut.org/jenkins/view/PostgreSQL/job/postgresql_commitfest_world/. You’ll see that a number of patches had issues. Most were compiler warnings, a few had documentation build issues, a couple had genuine build failures. Several (older) patches failed to apply. Those don’t show up in Jenkins at all.

This is not tied to Jenkins, however. You can run any other build automation against that Git repository, too, of course.

There are still some manual steps required. In particular, commitfest_branches needs to be run and the build reports need to be reported back manually. Fiddling with all those branches is error-prone. But overall, this is much less work than manually downloading and building all the patch submissions.

My goal is that by the time a reviewer gets to a patch, it is ensured that the patch applies, builds, and passes the tests. Then the reviewer can concentrate on validating the purpose of the patch and checking the architectural decisions.

What needs to happen next:

  • I’d like an easier way to post feedback. Given a message ID for the original patch submission, I need to fire off a reply email that properly attaches to the original thread. I don’t have an easy way to do that.

  • Those reply emails would then need to be registered in the commit fest application. Too much work.

  • There is another component to this workflow that I have not finalized: checking regularly whether the patches still apply against the master branch.

  • More automated tests need to be added. This is well understood and a much bigger problem.
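
For the first item, mail clients generally thread a reply by its In-Reply-To and References headers, so a reply can be built from nothing more than the original message ID. The following Python sketch shows the idea. The function name `build_reply` and all addresses are hypothetical. It assumes only the message ID is known, so References carries just that one ID instead of the full chain, and it leaves out actually sending the message.

```python
from email.message import EmailMessage


def build_reply(original_message_id, original_subject, sender, recipient, body):
    """Build a reply that mail clients should thread under the original."""
    # Message IDs are written in angle brackets inside these headers.
    msg_id = original_message_id.strip()
    if not msg_id.startswith("<"):
        msg_id = "<" + msg_id + ">"

    subject = original_subject
    if not subject.lower().startswith("re:"):
        subject = "Re: " + subject

    reply = EmailMessage()
    reply["From"] = sender
    reply["To"] = recipient
    reply["Subject"] = subject
    reply["In-Reply-To"] = msg_id
    # Assumption: the original's own References chain is not available.
    reply["References"] = msg_id
    reply.set_content(body)
    return reply


# Illustrative values only.
example = build_reply(
    "abc123@example.org",
    "Some patch",
    "ci@example.org",
    "list@example.org",
    "The patch applies, but the build emits new compiler warnings.",
)
print(example["In-Reply-To"])
```

The resulting message could be handed to any mail submission mechanism. Registering it in the commit fest application remains a separate step.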

In the meantime, I hope this is going to be useful. Let me know if you have suggestions, or send me pull requests on GitHub.

4 comments:

  1. This seems to be very similar to what was done per http://www.postgresql.org/message-id/CAKwe89BiDYQ6r__7NamviUf+N_0Zr=0HdSHxnkx41CNwGzSO1g@mail.gmail.com - it would probably be a good idea to coordinate efforts.

    FWIW, on your next steps:

    1. There will likely be an API for that in the new CF app (provided people like that app and it gets deployed, of course).

    2. There will definitely be an API for that in the new CF app (see 1.)

    3. Yeah, there's likely to be at least a partial API for that...

    4. That one is a different scope of course :)

  2. Have you looked at something like Gerrit + Gerrit plugin for Jenkins to build/test each patch?

    (it would mean using a different Git repo though).

    Replies
    1. I use that at work, and it partially inspired this setup, but it requires that everyone use Gerrit or that I make fake pushes into Gerrit, which is more complexity than we need right now, I think. It might be something to try in the future.

  3. I'd argue that we haven't tried to recruit additional reviewer resources. Not really.

    Currently the committers want people to do free review, but give them no credit, respect, or thanks. Oh, and terrible, incomprehensible, tools. Unsurprisingly, we get no reviewers.
