How to implement a uniqueness constraint across multiple tables?

Discussion:

Kynn Jones

2014-07-31 18:38:15 UTC

I want to implement something akin to OO inheritance among DB tables. The
idea is to define some "superclass" table, e.g.:

CREATE TABLE super (
super_id INT PRIMARY KEY,
...
-- other columns
);

CREATE TABLE sub_1 (
super_id INT PRIMARY KEY,
FOREIGN KEY (super_id) REFERENCES super(super_id),
...
-- other columns
);

CREATE TABLE sub_2 (
super_id INT PRIMARY KEY,
FOREIGN KEY (super_id) REFERENCES super(super_id),
...
-- other columns
);

...

CREATE TABLE sub_n (
super_id INT PRIMARY KEY,
FOREIGN KEY (super_id) REFERENCES super(super_id),
...
-- other columns
);

I cribbed this pattern from pp. 92-93 of Bill Kirwan's "SQL Antipatterns:
Avoiding the pitfalls of database programming". The approach has a
weakness, however, (which the author does not make sufficiently clear) and
that is that, as presented above, it would be possible for multiple "sub"
records (each from a different "sub_k" table) to refer to the same "super"
record, and this may not be consistent with the semantics of some
applications.

Does PostgreSQL have a good way to enforce the uniqueness of super_id
values across multiple tables?

(BTW, one could use PostgreSQL built-in support for table inheritance to
implement something very much like the scheme above. Unfortunately, as
explained in the documentation, there's no built-in support yet for
enforcing uniqueness across multiple subclass tables.)

Thanks in advance!

kj

PS: I'm sure that the problem described above crops up frequently, and that
one could find much material about it on the Web, but my online searches
have been hampered (I think) by my not having adequate search keywords for
it. I'd be interested in learning keywords to facilitate researching this
topic.

Marti Raudsepp

2014-07-31 19:16:45 UTC

Permalink

Does PostgreSQL have a good way to enforce the uniqueness of super_id values
across multiple tables?

Well that's easy: no.

Regards,
Marti

--
Sent via pgsql-general mailing list (pgsql-***@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Rob Sargent

2014-07-31 19:24:14 UTC

Permalink

Post by Marti Raudsepp

Does PostgreSQL have a good way to enforce the uniqueness of super_id values
across multiple tables?

Well that's easy: no.
Regards,
Marti

That might be a little hasty. There are conditional definitions clauses
in indices. One could apply a check such that all subs don't have the
id or on each sub such that the others don't have the id. But can be
made unnecessary if inserts to all subs work off same id generator.

rjs

Rob Sargent

2014-07-31 19:19:49 UTC

Permalink

Post by Kynn Jones
I want to implement something akin to OO inheritance among DB tables.
CREATE TABLE super (
super_id INT PRIMARY KEY,
...
-- other columns
);
CREATE TABLE sub_1 (
super_id INT PRIMARY KEY,
FOREIGN KEY (super_id) REFERENCES super(super_id),
...
-- other columns
);
CREATE TABLE sub_2 (
super_id INT PRIMARY KEY,
FOREIGN KEY (super_id) REFERENCES super(super_id),
...
-- other columns
);
...
CREATE TABLE sub_n (
super_id INT PRIMARY KEY,
FOREIGN KEY (super_id) REFERENCES super(super_id),
...
-- other columns
);
I cribbed this pattern from pp. 92-93 of Bill Kirwan's "SQL
Antipatterns: Avoiding the pitfalls of database programming". The
approach has a weakness, however, (which the author does not make
sufficiently clear) and that is that, as presented above, it would be
possible for multiple "sub" records (each from a different "sub_k"
table) to refer to the same "super" record, and this may not be
consistent with the semantics of some applications.
Does PostgreSQL have a good way to enforce the uniqueness of super_id
values across multiple tables?
(BTW, one could use PostgreSQL built-in support for table inheritance
to implement something very much like the scheme above.
Unfortunately, as explained in the documentation, there's no built-in
support yet for enforcing uniqueness across multiple subclass tables.)
Thanks in advance!
kj
PS: I'm sure that the problem described above crops up frequently, and
that one could find much material about it on the Web, but my online
searches have been hampered (I think) by my not having adequate search
keywords for it. I'd be interested in learning keywords to facilitate
researching this topic.

Wouldn't this be a problem only if new subn() could/would re-use an id?
if new sub() generates a unique id, there would be no chance of two subn
entries having the same id.

rjs

Kynn Jones

2014-08-01 13:04:45 UTC

Permalink

Post by Rob Sargent
Wouldn't this be a problem only if new subn() could/would re-use an id?
if new sub() generates a unique id, there would be no chance of two subn
entries having the same id.

I'd thought that the ids of the sub_k tables were never generated
independently, but rather they must pre-exist as values of the super_id
column of the super table. After reading your post though, I'm no longer
sure that this is what Kirwan had in mind... (He does not give any details
at all on how the IDs should be created.)

If I understand you correctly, there should be a "business rule"
*somewhere* that says that entries in the super table must *always* be
created following these steps:

1. generate a new (unique) ID from a sequence super_seq;
2. insert a new entry in the super table having this (necessarily
unique) ID in its super_id column;
3. insert a new entry in some some sub_k table, having this ID in its
super_id column;
4. (somehow) disallow any subsequent updating of the super_id field of
this newly-added sub_k table (although it could still be OK to delete a
record from the super table, and cascade this to the appropriate record in
some sub_k table).

I'm sure this sort of thing could be implemented in PostgreSQL, though I'd
be hard-pressed to fill in the details. How much of this can be specified
in the definitions (CREATE TABLE ...) of the tables? For example, (1)
could be taken care of by defining the super_id column of the super table
as a SERIAL. I imagine that (2) and (3) would have to be encapsulated in a
"stored procedure" . Can (4) be implemented in the definitions of the
tables?

Thanks!

Kevin Grittner

2014-07-31 19:52:11 UTC

Permalink

Post by Kynn Jones
I want to implement something akin to OO inheritance among DB
    CREATE TABLE super (
        super_id INT PRIMARY KEY,
        ...
        -- other columns
    );
    CREATE TABLE sub_1 (
        super_id INT PRIMARY KEY,
        FOREIGN KEY (super_id) REFERENCES super(super_id),
        ...
        -- other columns
    );
    CREATE TABLE sub_2 (
        super_id INT PRIMARY KEY,
        FOREIGN KEY (super_id) REFERENCES super(super_id),
        ...
        -- other columns
    );
    ...
    CREATE TABLE sub_n (
        super_id INT PRIMARY KEY,
        FOREIGN KEY (super_id) REFERENCES super(super_id),
        ...
        -- other columns
    );
it would be possible for multiple "sub" records (each from a
different "sub_k" table) to refer to the same "super" record, and
this may not be consistent with the semantics of some
applications.
Does PostgreSQL have a good way to enforce the uniqueness of
super_id values across multiple tables?

This goes beyond the capabilities of declarative constraints to
enforce. You can enforce it using triggers, but you need to handle
race conditions, which is not easy with MVCC behavior (where reads
don't block anything and writes don't block reads). There are
basically two ways to cover that:

(1) You can introduce blocking. This can be done with LOCK TABLE
statements, but that tends to be a pretty crude tool for this.
You might be able to make clever use of transactional advisory
locks. Or you could have a child_count column in the super table
which is maintained by "AFTER EACH ROW" triggers for INSERT and
DELETE.

(2) You can ensure that all transactions which could affect this
invariant use the SERIALIZABLE transaction isolation level. The
triggers can then check that there is not a matching row in more
than one "sub" table without worrying about the race conditions
(beyond automatically retrying a transaction which throws a
serialization failure). An example of handling something vaguely
similar using SERIALIZABLE transactions is here:

http://wiki.postgresql.org/wiki/SSI#FK-Like_Constraints

You should probably review this entire chapter in the
documentation:

http://www.postgresql.org/docs/current/interactive/mvcc.html

By the way, I saw exactly this pattern in a financial accounting
system for courts. The super table had common information for all
financial transactions, and there were separate sub tables for
assessments, receipts, checks, etc. We had a "transaction type"
code column in the super table to specify *which* of the sub tables
should be populated for each row in the super table. I don't know
whether you have anything like that, but I think the issues are
similar either way -- perhaps a little simpler with such a code
than without.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-general mailing list (pgsql-***@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Kynn Jones

2014-08-01 13:05:54 UTC

Permalink

On Thu, Jul 31, 2014 at 3:52 PM, Kevin Grittner <***@ymail.com> wrote:

This goes beyond the capabilities of declarative constraints to

enforce. You can enforce it using triggers, but you need to handle
race conditions, which is not easy with MVCC behavior (where reads
don't block anything and writes don't block reads). There are

<snip>

Thanks for the detailed response. It gives me much to work with/follow-up
on.

Pujol Mathieu

2014-08-01 07:18:44 UTC

Permalink

Hi,
Maybe you can use inheritance.
CREATE TABLE super (
super_id INT PRIMARY KEY,
...
-- other columns
);

CREATE TABLE sub_template (
super_id INT PRIMARY KEY,
FOREIGN KEY (super_id) REFERENCES super(super_id) UNIQUE,
);

CREATE TABLE sub_1 (
-- other columns
) INHERITS (sub_template);

CREATE TABLE sub_2 (
-- other columns
) INHERITS (sub_template);

So the foreign key constraint will be on the sub_template avoiding two
row of sub_x to reference the same foreign key.
This is just an idea I let you check for syntax.
http://www.postgresql.org/docs/9.3/static/ddl-inherit.html
Regards,
Mathieu

--
Sent via pgsql-general mailing list (pgsql-***@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

David G Johnston

2014-08-01 07:28:52 UTC

Permalink

Post by Pujol Mathieu

Post by Kynn Jones
I want to implement something akin to OO inheritance among DB tables.
(BTW, one could use PostgreSQL built-in support for table inheritance
to implement something very much like the scheme above.
Unfortunately, as explained in the documentation, there's no built-in
support yet for enforcing uniqueness across multiple subclass tables.)

Maybe you can use inheritance.
So the foreign key constraint will be on the sub_template avoiding two
row of sub_x to reference the same foreign key.
This is just an idea I let you check for syntax.
http://www.postgresql.org/docs/9.3/static/ddl-inherit.html
Regards,
Mathieu

You should read Section 5.8.1 (Caveats) of the page your referenced. Or the
"BTW" in the OP which reiterates the salient points.

David J.

--
View this message in context: http://postgresql.1045698.n5.nabble.com/How-to-implement-a-uniqueness-constraint-across-multiple-tables-tp5813448p5813503.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.

--
Sent via pgsql-general mailing list (pgsql-***@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Pujol Mathieu

2014-08-01 07:47:45 UTC

Permalink

Post by David G Johnston

Post by Pujol Mathieu

You should read Section 5.8.1 (Caveats) of the page your referenced. Or the
"BTW" in the OP which reiterates the salient points.
David J.
--
View this message in context: http://postgresql.1045698.n5.nabble.com/How-to-implement-a-uniqueness-constraint-across-multiple-tables-tp5813448p5813503.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.

You are right, I didn't saw that.
So ignore my previous message. Or maybe for future release like the
documentation says.
Mathieu
--
Mathieu PUJOL
Ingénieur Réalité Virtuelle
REAL FUSIO - 3D Computer Graphics
10, rue des arts - 31000 TOULOUSE - FRANCE
***@realfusio.com - http://www.realfusio.com

--
Sent via pgsql-general mailing list (pgsql-***@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Alban Hertroys

2014-08-01 07:58:48 UTC

Permalink

Post by Kynn Jones
CREATE TABLE super (
super_id INT PRIMARY KEY,
...
-- other columns
);
CREATE TABLE sub_1 (
super_id INT PRIMARY KEY,
FOREIGN KEY (super_id) REFERENCES super(super_id),
...
-- other columns
);
CREATE TABLE sub_2 (
super_id INT PRIMARY KEY,
FOREIGN KEY (super_id) REFERENCES super(super_id),
...
-- other columns
);
...
CREATE TABLE sub_n (
super_id INT PRIMARY KEY,
FOREIGN KEY (super_id) REFERENCES super(super_id),
...
-- other columns
);
I cribbed this pattern from pp. 92-93 of Bill Kirwan's "SQL Antipatterns: Avoiding the pitfalls of database programming". The approach has a weakness, however, (which the author does not make sufficiently clear) and that is that, as presented above, it would be possible for multiple "sub" records (each from a different "sub_k" table) to refer to the same "super" record, and this may not be consistent with the semantics of some applications.
Does PostgreSQL have a good way to enforce the uniqueness of super_id values across multiple tables?

Not in and of itself, but if you change the pattern a little you can have uniqueness:

CREATE TABLE super (
super_id INT,
— Add a type to the PK
type text,
PRIMARY KEY (super_id, type),
...
-- other columns
);

CREATE TABLE sub_1 (
super_id INT,
— Constrain the records in a sub-table to have a specific type
type text CHECK (type = ’sub_1’),
PRIMARY KEY (super_id, type),
FOREIGN KEY (super_id, type) REFERENCES super(super_id, type),
...
-- other columns
);

etc.

You still won’t have a unique super_id, but the combination of (super_id, type) will be unique.

Unfortunately, this approach breaks (again) if you would want to allow for multiple inheritance. You could fix that by keeping multiple levels of “type”, using multiple type-columns or perhaps an array, but that gets ugly fast.

Alban Hertroys
--
If you can't see the forest for the trees,
cut the trees and you'll find there is no forest.

--
Sent via pgsql-general mailing list (pgsql-***@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general