What is the best way to enforce a 'subset' relationship with integrity constraints?

一个人的身影 2020-12-10 18:11

For example, given 3 tables:

  • gastropod
  • snail
  • slug

and assuming we want to enforce that

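  1. every row in 'gastropod' has exactly one corresponding row in 'snail' or 'slug' (but not both)
  2. every row in 'slug' has exactly one corresponding row in 'gastropod'
  3. every row in 'snail' has exactly one corresponding row in 'gastropod'

12 Answers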
  • 2020-12-10 18:52

    I know this problem as a supertype/subtype issue. I've written about it several times on SO. In one post, it's presented as a solution to a problem with staff, customers, and suppliers. Another post has the most extended discussion of the rationale and of how the constraints work; it's written in terms of online publications.

  • 2020-12-10 18:58

    Check out this thread: Maintaining subclass integrity in a relational database

    The thread provides multiple suggestions for SQL Server implementations, and I would be surprised if the ideas couldn't be applied to Oracle as well.

  • 2020-12-10 18:58

    "@Erwin I'd much prefer solutions that do not involve triggers - I have a pathological aversion to them."

    Sorry for new answer, not authorised to add a comment to this.

    As far as I can see, you might be able to get away with "just using deferred constraints" in your particular case, owing to the nature of the constraint you wish to impose. If it works for you, and you are satisfied, then all is OK, no?

    My main point is, constraints (as in : "any imaginable business rule you might run into as a database designer") can get arbitrarily complex. Think of a genealogy database in which you want to enforce the rule that "no person can be an ancestor of himself, IN WHATEVER DEGREE" (that's my favourite example because it ultimately involves transitive closure and/or recursion). There is NO WAY you can get an SQL DBMS to enforce such rules without using triggers (or without using recursive SQL inside the trigger for that matter too, by the way).
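    To make the genealogy example concrete, here is a minimal sketch of the kind of recursive query such a trigger might run. The table parent_of and its columns are hypothetical, and the syntax assumes PostgreSQL's WITH RECURSIVE; it only detects violations and is not itself a constraint.

    ```sql
    -- Hypothetical table: one row per (child, parent) edge.
    CREATE TABLE parent_of (
        child_id  int NOT NULL,
        parent_id int NOT NULL,
        PRIMARY KEY (child_id, parent_id)
    );

    -- Transitive closure of the parent relationship.
    -- UNION (rather than UNION ALL) discards duplicate rows,
    -- so the recursion terminates even when the data contains a cycle.
    WITH RECURSIVE ancestry (person_id, ancestor_id) AS (
        SELECT child_id, parent_id
        FROM parent_of
        UNION
        SELECT a.person_id, p.parent_id
        FROM ancestry a
        JOIN parent_of p ON p.child_id = a.ancestor_id
    )
    -- Every person who appears among their own ancestors.
    SELECT DISTINCT person_id
    FROM ancestry
    WHERE person_id = ancestor_id;
    ```

    A trigger on parent_of could run a query like this and raise an error whenever it returns a row.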

    Neither your DBMS nor I nor anyone else skilled in relational theory will care a Freudian shit about whatever pathologies you happen to have. But perhaps because of these pathologies that you mention, it might be interesting to observe that you can do all of the stuff you want, without having to define any triggers, if you use the DBMS I have developed myself (it does support trigger-like things, but you're not required to resort to them for enforcing data integrity).

  • 2020-12-10 18:59

    A foreign key referencing gastropod from slug and snail, with a unique index on the foreign key columns, enforces rules 2 and 3. Rule 1 is trickier though :-(

    The only way I know of to enforce rule 1 is to write some database code that checks snail and slug for the presence of a row.

    By the way - how do you intend to insert data? Whatever order you do it in, you will break a rule.
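    The foreign-key part of this answer can be sketched as follows. The column name id is an assumption (the question does not give column names), each child's primary key doubles as the unique index on the foreign key column, and the syntax is PostgreSQL's. The final query is only an audit for rule 1, not an enforced constraint.

    ```sql
    CREATE TABLE gastropod (
        id int PRIMARY KEY
    );

    -- Rules 2 and 3: each child row points at exactly one gastropod,
    -- and the primary key prevents two child rows for the same gastropod.
    CREATE TABLE snail (
        id int PRIMARY KEY REFERENCES gastropod (id)
    );

    CREATE TABLE slug (
        id int PRIMARY KEY REFERENCES gastropod (id)
    );

    -- Rule 1 audit: list gastropods that have no child at all,
    -- or that have both a snail row and a slug row.
    SELECT g.id
    FROM gastropod g
    LEFT JOIN snail s ON s.id = g.id
    LEFT JOIN slug  l ON l.id = g.id
    WHERE (s.id IS NULL) = (l.id IS NULL);
    ```

    With these immediate foreign keys, the parent row has to be inserted first, so rule 1 is necessarily broken for a moment between the two inserts.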

  • 2020-12-10 19:01

    All these examples have an atrocious level of complexity for something as simple as:

    create table gastropod(
        average_length numeric
    );
    create table slug(
        like gastropod,
        id          serial  primary key,
        is_mantle_visible boolean
    );
    create table snail(
        like gastropod,
        id          serial  primary key,
        average_shell_volume numeric
    );   
    \d snail;
    
            Column        |  Type   |                     Modifiers                      
    ----------------------+---------+----------------------------------------------------
     average_length       | numeric | 
     id                   | integer | not null default nextval('snail_id_seq'::regclass)
     average_shell_volume | numeric | 
    Indexes:
        "snail_pkey" PRIMARY KEY, btree (id)
    

    Before you say this is not an answer, think about the requirements.

    1. every row in 'gastropod' has exactly one corresponding row in 'snail' or 'slug' (but not both)
    2. every row in 'slug' has exactly one corresponding row in 'gastropod'
    3. every row in 'snail' has exactly one corresponding row in 'gastropod'

    Having the parent's columns directly in each child table gives equivalent data integrity without all the nonsense.

    Note: LIKE in the DDL can copy all the columns (even constraints and indexes in 9.0) into the new table. So you can sort of fake inheritance.
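    As a small illustration of that note, here is a sketch in which LIKE also copies a CHECK constraint. The CHECK on average_length is a hypothetical addition made only for this example; INCLUDING CONSTRAINTS is the PostgreSQL option that copies CHECK constraints.

    ```sql
    CREATE TABLE gastropod (
        -- Hypothetical constraint, added only to show that it is copied.
        average_length numeric CHECK (average_length > 0)
    );

    CREATE TABLE snail (
        LIKE gastropod INCLUDING CONSTRAINTS,
        id serial PRIMARY KEY,
        average_shell_volume numeric
    );

    -- Rejected: the CHECK copied from gastropod applies to snail as well.
    INSERT INTO snail (average_length) VALUES (-1);
    ```

    The copy happens once, when snail is created. Later changes to gastropod are not applied to snail, which is why this only "sort of" fakes inheritance.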

  • 2020-12-10 19:01

    You have two issues here:

    • Presence: there cannot be a parent row without at least one child row.
    • Exclusivity: there cannot be a parent row with more than one child row.

    On a DBMS that supports deferred constraints (including PostgreSQL and Oracle), both of these goals can be achieved declaratively:

    [Diagram: gastropod, snail, and slug tables linked by circular foreign keys]

    There is a circular foreign key between gastropod.snail_id and snail.snail_id, and also between gastropod.slug_id and slug.slug_id. There is also a CHECK that ensures exactly one of them matches gastropod.gastropod_id (and the other is NULL).

    To break the chicken-and-egg problem when inserting new data, defer one direction of foreign keys.

    Here is how this would be implemented in PostgreSQL:

    CREATE TABLE gastropod (
        gastropod_id int PRIMARY KEY,
        snail_id int UNIQUE,
        slug_id int UNIQUE,
        CHECK (
            (slug_id IS NULL AND snail_id IS NOT NULL AND snail_id = gastropod_id)
            OR (snail_id IS NULL AND slug_id IS NOT NULL AND slug_id = gastropod_id)
        )    
    );
    
    CREATE TABLE snail (
        snail_id int PRIMARY KEY,
        FOREIGN KEY (snail_id) REFERENCES gastropod (snail_id) ON DELETE CASCADE
    );
    
    CREATE TABLE slug (
        slug_id int PRIMARY KEY,
        FOREIGN KEY (slug_id) REFERENCES gastropod (slug_id) ON DELETE CASCADE
    ); 
    
    ALTER TABLE gastropod ADD FOREIGN KEY (snail_id) REFERENCES snail (snail_id)
        DEFERRABLE INITIALLY DEFERRED;
    
    ALTER TABLE gastropod ADD FOREIGN KEY (slug_id) REFERENCES slug (slug_id)
        DEFERRABLE INITIALLY DEFERRED;
    

    New data is inserted like this:

    START TRANSACTION;
    INSERT INTO gastropod (gastropod_id, snail_id) VALUES (1, 1);
    INSERT INTO snail (snail_id) VALUES (1);
    COMMIT;
    

    However, attempting to insert a parent without a child fails:

    START TRANSACTION;
    INSERT INTO gastropod (gastropod_id, snail_id) VALUES (2, 2);
    COMMIT; -- FK violation.
    

    Inserting the wrong kind of child fails:

    START TRANSACTION;
    INSERT INTO gastropod (gastropod_id, snail_id) VALUES (2, 2);
    INSERT INTO slug (slug_id) VALUES (2); -- FK violation.
    COMMIT;
    

    And setting too few, too many, or mismatched fields in the parent also fails:

    INSERT INTO gastropod (gastropod_id) VALUES (2); -- CHECK violation.
    ...
    INSERT INTO gastropod (gastropod_id, snail_id, slug_id) VALUES (2, 2, 2); -- CHECK violation.
    ...
    INSERT INTO gastropod (gastropod_id, snail_id) VALUES (1, 2); -- CHECK violation.
    

    On a DBMS that doesn't support deferred constraints, exclusivity (but not presence) can be declaratively enforced like this:

    [Diagram: parent and child tables with a type discriminator column and unique constraint U1]

    Under a DBMS that supports calculated fields (such as Oracle 11 virtual columns), the type discriminator doesn't need to be physically stored at the level of the child tables (only in the parent table).

    The unique constraint U1 may be necessary on DBMSes that don't support a FK referencing a super-set of a key (pretty much all of them, as far as I know), so we make this super-set a key artificially.
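    A minimal sketch of a discriminator design of this kind, written for PostgreSQL. The column name gastropod_type and the values 'snail' and 'slug' are assumptions, since the original diagram is not reproduced here. The discriminator is stored physically in the child tables rather than as a calculated field.

    ```sql
    CREATE TABLE gastropod (
        gastropod_id   int  PRIMARY KEY,
        gastropod_type text NOT NULL
            CHECK (gastropod_type IN ('snail', 'slug')),
        -- U1: an artificial key over a super-set of the primary key,
        -- so that the child tables can reference (id, type) together.
        UNIQUE (gastropod_id, gastropod_type)
    );

    CREATE TABLE snail (
        snail_id       int  PRIMARY KEY,
        gastropod_type text NOT NULL DEFAULT 'snail'
            CHECK (gastropod_type = 'snail'),
        FOREIGN KEY (snail_id, gastropod_type)
            REFERENCES gastropod (gastropod_id, gastropod_type)
            ON DELETE CASCADE
    );

    CREATE TABLE slug (
        slug_id        int  PRIMARY KEY,
        gastropod_type text NOT NULL DEFAULT 'slug'
            CHECK (gastropod_type = 'slug'),
        FOREIGN KEY (slug_id, gastropod_type)
            REFERENCES gastropod (gastropod_id, gastropod_type)
            ON DELETE CASCADE
    );

    INSERT INTO gastropod (gastropod_id, gastropod_type) VALUES (1, 'snail');
    INSERT INTO snail (snail_id) VALUES (1);
    INSERT INTO slug (slug_id) VALUES (1); -- FK violation: gastropod 1 is a snail.
    ```

    Because a parent row carries exactly one type and each child table accepts only its own type, a gastropod can never have both kinds of child. Nothing here forces a child row to exist, which is the presence rule this variant gives up.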


    Whether all this should actually be done in practice is another matter. This is one of these situations where enforcing some aspects of data integrity at the application level may be justified by the reduction of overhead and complexity.
