Sunday, January 3, 2010

Missing Features for PostgreSQL SQL Conformance

A thought to start the new year: Perhaps it's time for the final push to complete the core SQL conformance for PostgreSQL.

Where do we stand? The PostgreSQL documentation lists in its appendix the currently supported and unsupported SQL features. As explained there, a certain subset of these features represents the "Core" features, which every conforming SQL implementation must supply, while the rest is purely optional. The unsupported features page currently lists 14 remaining Core features and subfeatures that are missing from PostgreSQL. Two of those are about client-side module support that is actually not mandatory if the implementation provides an embedded language (e.g., ECPG), so there are 12 items left.

So that's not so bad. Here's a list of the missing features:

E081-09 USAGE privilege

This would mean adding a USAGE privilege to domains.

Maybe this isn't very useful, although perhaps those working on SELinux support might have a more qualified opinion on it.  But let's say if we get all the other things done and this is left, this would be a fairly straightforward and well-defined feature to add.

(This would then complete feature E081 Basic Privileges.)

E153 Updatable queries with subqueries

This presupposes updatable views and requires views to be updatable even if their WHERE clause contains a subquery.

This is probably the big one. In the current PostgreSQL architecture, updatable views are apparently quite difficult to implement correctly. The mailing list archives contain plenty of details.

F311-04 CREATE VIEW: WITH CHECK OPTION

This also presupposes updatable views and requires the CHECK OPTION feature. See above.

(This would then complete feature F311 Schema definition statement.)

F812 Basic flagging

This feature means that there should be some implementation-specific facility that raises a notice or warning when a not standard-conforming SQL statement or clause is used. Or in other words a facility that warns when a PostgreSQL extension is used.

A naive implementation might consist of just adding something like elog(WARNING, "not SQL standard") in about five hundred places, but the trick would be to implement it in a way that is easy to maintain in the future. The mailing list archives also contain some discussions about this, key word "SQL flagger".

S011 Distinct data types

This is a way to define user-defined types based on existing types, like
CREATE TYPE new AS old;
Unlike domains, this way the new type does not inherit any of the functions and operators from the old type. This might sound useless at first, but it can actually create better type safety. For example, you could create a type like
CREATE TYPE order_number AS int;
while preventing that someone tries to, say, multiply order numbers.

The implementation effort would probably be similar to that for domains or enums. Also, search the mailing list archives for "distinct types".

(This includes feature S011-01 USER_DEFINED_TYPES view.)

T321 Basic SQL-invoked routines

There are a number of bits missing from fully SQL-compatible SQL function definitions, besides the specific subfeatures mentioned below.
  • Instead of a routine body like AS $$ ... $$, allow one unquoted SQL statement as routine body (see example below under RETURN).
  • LANGUAGE SQL is the default.
  • SPECIFIC xyz clause, allowing the assignment of an explicit "specific routine name" that can be used to refer to the function even when overloaded. Probably not terribly useful for PostgreSQL.
  • DETERMINISTIC / NOT DETERMINISTIC clause. DETERMINISTIC means the same as IMMUTABLE in PostgreSQL; NOT DETERMINSTIC is then STABLE or VOLATILE.
  • CONTAINS SQL / READS SQL DATA / MODIFIES SQL DATA clause. These also appear to overlap with the volatility property in PostgreSQL: MODIFIES would make the function volatile, READS would make it
    STABLE.
Also, for DROP FUNCTION the ability to drop a function by its "specific name" is required:
DROP SPECIFIC FUNCTION specific_name;
There are probably some more details missing, so part of finishing this item would also be some research.

T321-02 User-defined stored procedures with no overloading

Add a new command CREATE PROCEDURE that does that same thing as CREATE FUNCTION .. RETURNS void, and a DROP PROCEDURE command.

T321-04 CALL statement

Add a new command CALL procname() that does the same thing as SELECT procname() but requires procname() to not return a value, meaning it has to be a procedure in the above sense.

T321-05 RETURN statement

Add a new command RETURN callable only from within SQL functions. Then, instead of writing a function like
CREATE FUNCTION name(args) RETURNS type LANGUAGE SQL
AS $$ SELECT something $$;
write
CREATE FUNCTION name(args) RETURNS type LANGUAGE SQL
RETURN something;

That's it! Plus all the stuff I missed, of course. We only have about 2 weeks left(!) until the final commit fest for the 8.5 release, so it's a bit late to tackle these issues now, but maybe for the release after that?

7 comments:

  1. Hello, stored procedures are not only about implementation CALL statement. Stored procedures have to support passing parameters by ref. More, have to have transaction control, and optionally support of unbounded selects.

    ReplyDelete
  2. That might be your wish, but that is not what those specific features in the SQL standard say. There are additional optional features that specify some of what you are saying.

    ReplyDelete
  3. I cannot speak it exactly. My knowledge is based Jim Melton book http://www.amazon.co.uk/Understanding-SQLs-Persistent-Stored-Modules/dp/1558604618 . But is possible so some higher levels are not defined in base, it is defined in SQL/PSM.

    ReplyDelete
  4. I think that F312 MERGE statement would be a great improvement., worth mentioning in your list.

    ReplyDelete
  5. In light of the optional features, stored procedures should be a distinct type than simply void functions; one must disallow pairing a CREATE PROCEDURE with a DROP FUNCTION. Otherwise implementing the optional features will break compatibility.

    ReplyDelete
  6. You missed the uppercase/lowercase identifier discrepency:

    SQL: select x -> select "X"
    PostgreSQL: select x -> select "x"

    ReplyDelete
  7. Consider this a second vote for F312 MERGE. It isn't necessary every day but there are many cases where it would be very handy, and it would make migration from MySQL substantially easier.

    ReplyDelete