Wednesday, June 30, 2010

An Update On Views

A long time ago, in a text book far away, someone described the following (relational) database design approach: Put all your data in tables, normalized and so on, as you'd expect.  An top of each table, create a view.  The database clients only access the views.  As the application evolves, tables might change, more views will be added, but the interface that the existing views present to the outside stays the same.  Sounds nice; everything stays backward compatible.

In order for this to work, views need to be updatable.  Updatable views have been standardized in SQL as far back as 1992, and have been implemented (partially, possibly) in some of the popular/successful SQL implementations.  Curiously, however, this feature has evaded PostgreSQL altogether so far.  Conventional wisdom has held it that views can be made updatable by hand using the rewrite rules system, and a proper implementation of automatically updatable views would only need to create the rules automatically.  Debate and coding around this idea has been taking place for many years, mostly inspired by the heroic efforts of Bernd Helmle.  But I think the conclusion of those in the know by now is that the rewrite rules system is inappropriate for this task (and in fact others have argued that the rewrite rules system is inappropriate for any task and should be ripped out; perhaps harsh but not far from the truth in my opinion).  So the updatable views effort is at square one (or zero) at the moment.

In the meantime, updatable views have been conspicuously absent from any of the recent top-needed features list for PostgreSQL.  Which basically means no one really cares (anymore).  Why is that?

My observation is that the typical use of views has changed quite a bit of the years (decades), and has decreased altogether.  The idea of using views as transparent interfaces on top of tables never really took off to begin with.  Using views to encapsulate complex queries is still reasonable, but not terribly common.  Part of the reason is that many queries are generated, either by ORMs or by custom code.  The other reason is that server-side functions provide a much more powerful abstraction mechanism.  They can't do everything, in particular inlining of queries doesn't alway work, but they can do a lot more with parameters and general procedural programming.  Views are also used for access control, either for columns or for rows.  Column security can now be had with explicit column privileges in PostgreSQL, and one could argue that using views for this was a workaround all along.  And using views for row access control doesn't actually really work, as it turns out.  Proper row security might be an upcoming feature, but it will need to work differently.  And anyway, superfinely granular access control is rather rare in SQL databases overall, I guess.  And in all of these cases, the views don't need to be updatable.

Views are occasionally used as a thin wrapper around a table-returning function, so the function can be access like a table.  But this only works when reading.  Another use, somewhat related actually, is defining views for monitoring the database system or the database application.  Because those views are often used interactively, it is
important that they are convenient to access, and so hugely complex queries combined with custom functions or possibly query generation are inappropriate.

On the other hand, materialized views are all the rage now, appearing at the top of the requested PostgreSQL feature list, and having been fixtures in the feature lists of other SQL implementations for some time.  While traditional views are a tool for organizing code and interfaces, materialized views are purely a performance optimization tool (which, by the way, is the reason materialized views are not
specified in the SQL standard).  Depending on how you implement materialized views, they might even be usable implicitly, if the query matches the materialized query.  Then, materialized views wouldn't really be views at all anymore, they're just query caches.

So, what do you think?  Are traditional views an obsolescent, marginal feature?  Is the matter of updatable views obsolete?