Enforce Uniqueness of Related Entities

Question

In a relational database (SQL), I have a parent entity that can have 0..n related child entities. The parent entity is uniquely identified in part by its collection of related child entities, such that I should not be able to have two similar parents with the same collection of children.

So I could have Parent 1 with Child 1 and Child 2, and Parent 2 with Child 2 and Child 3, but I cannot have another parent with Child 2 and Child 3.

Ideally, I would like to enforce this uniqueness using a database constraint. I've considered storing a hash of all child records with the parent, but was wondering if there was an easier / more standard way of accomplishing this.

Any ideas?

Answer 1:

This kind of constraint is tricky because SQL has no relational equality operator, i.e. no simple way of evaluating A = B where A and B are sets of rows. Standard SQL does support nested tables, but unfortunately SQL Server does not.

One possible answer is a predicate like the following, which checks for any identical families in a table:

NOT EXISTS (
    SELECT 1
    FROM family f, family g
    WHERE f.child = g.child
    AND f.parent <> g.parent
    GROUP BY f.parent, g.parent
    HAVING COUNT(*) = (SELECT COUNT(*) FROM family WHERE parent = f.parent)
    AND COUNT(*) = (SELECT COUNT(*) FROM family WHERE parent = g.parent)
    )

Notice that this query doesn't attempt to deal with childless families. In set-theoretic terms two empty sets are necessarily identical. If you want to allow for childless families then you would have to decide whether two childless families should be deemed identical or not.
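To see the predicate in action, here is a minimal sketch that turns it into a query listing the offending pairs. The sample rows are made up for illustration, and the primary key is an assumption: the COUNT(*) comparison only works if a parent cannot list the same child twice.

```sql
-- Hypothetical sample data; table and column names follow the predicate above.
CREATE TABLE family (
    parent INT NOT NULL,
    child  INT NOT NULL,
    PRIMARY KEY (parent, child)  -- assumed: no duplicate parent/child rows
);

INSERT INTO family (parent, child)
VALUES (1, 1), (1, 2),
       (2, 2), (2, 3),
       (3, 2), (3, 3);   -- parent 3 repeats parent 2's children

-- List each pair of parents whose child sets are identical.
SELECT f.parent AS parent_a, g.parent AS parent_b
FROM family f
JOIN family g
  ON f.child = g.child
 AND f.parent < g.parent   -- report each pair once instead of twice
GROUP BY f.parent, g.parent
-- The number of shared children must equal the size of both families.
HAVING COUNT(*) = (SELECT COUNT(*) FROM family WHERE parent = f.parent)
   AND COUNT(*) = (SELECT COUNT(*) FROM family WHERE parent = g.parent);
```

With this data the query returns the single pair (2, 3). Parents 1 and 2 share only child 2, which is fewer than either family's size, so they are not reported.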

SQL is not a truly relational language and it falls well short of what a relational language ought to be capable of. Tutorial D is an example of a real relational language that does support relational equality and relation-valued attributes. In Tutorial D you can in principle represent each family as a value of a single attribute in a relvar. That family attribute can also be a key and therefore duplicate families would not be allowed.

Answer 2:

Thanks for the help from those who suggested using a trigger. This is roughly what I have, and it seems to be working.

CREATE TRIGGER [dbo].[trig_Parent_Child_Uniqueness]
ON [dbo].[Parent_Child]
AFTER INSERT, UPDATE
AS
BEGIN
    IF EXISTS (
        SELECT 1
        FROM Parent p1
        --Compare each pair of parents
        JOIN Parent p2 ON p1.ParentId <> p2.ParentId
        WHERE NOT EXISTS (
            --Find any children that are different
            SELECT 1
            FROM (
                SELECT ChildId FROM Parent_Child c1
                WHERE c1.ParentId = p1.ParentId
            ) as c1
            FULL OUTER JOIN (
                SELECT ChildId FROM Parent_Child c2
                WHERE c2.ParentId = p2.ParentId
            ) as c2 ON c2.ChildId = c1.ChildId
            WHERE c1.ChildId IS NULL OR c2.ChildId IS NULL
        )
    ) ROLLBACK;
END;

EDIT: Or a better solution, adapted from @sqlvogel

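CREATE TRIGGER [dbo].[trig_Parent_Child_Uniqueness]
ON [dbo].[Parent_Child]
AFTER INSERT, UPDATE
AS
BEGIN
    -- Assumes (ParentId, ChildId) is unique in Parent_Child.
    -- Deletes can also make two parents identical; add DELETE to the
    -- trigger events if that matters.
    IF EXISTS (
        SELECT 1
        FROM Parent_Child p1
        JOIN Parent_Child p2 ON p1.ParentId <> p2.ParentId
            AND p1.ChildId = p2.ChildId
        --Group per pair of parents; grouping by p1.ParentId alone
        --would also reject parents whose children are merely shared
        GROUP BY p1.ParentId, p2.ParentId
        --The shared children must account for all children of both parents
        HAVING COUNT(*) = (SELECT COUNT(*) FROM Parent_Child WHERE ParentId = p1.ParentId)
            AND COUNT(*) = (SELECT COUNT(*) FROM Parent_Child WHERE ParentId = p2.ParentId)
    ) ROLLBACK;
END;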
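To try either trigger, something like the following could be used. The schema is an assumption, since the answer does not show its table definitions; only the table and column names are taken from the trigger code.

```sql
-- Assumed minimal schema; the answer does not show its table definitions.
CREATE TABLE Parent (
    ParentId INT NOT NULL PRIMARY KEY
);

CREATE TABLE Parent_Child (
    ParentId INT NOT NULL REFERENCES Parent (ParentId),
    ChildId  INT NOT NULL,
    PRIMARY KEY (ParentId, ChildId)  -- no duplicate parent/child rows
);
GO

-- Create one of the triggers above at this point, then:

INSERT INTO Parent (ParentId) VALUES (1), (2), (3);

-- Expected to succeed: parents 1 and 2 have different child sets.
INSERT INTO Parent_Child (ParentId, ChildId)
VALUES (1, 1), (1, 2), (2, 2), (2, 3);

-- Expected to be rolled back: parent 3 would duplicate parent 2.
INSERT INTO Parent_Child (ParentId, ChildId)
VALUES (3, 2), (3, 3);
```

When the trigger issues ROLLBACK, SQL Server aborts the batch and reports that the transaction ended in the trigger. Note that the first trigger compares rows in Parent, so it treats two childless parents as identical and would roll back any change made while two such parents exist.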

Answer 3:

This is a bit yucky as it includes triggers and cursors :(

It adds a column to the parent table whose value is derived from that parent's children.

Set up:

CREATE TABLE Parent
(
    Id INT PRIMARY KEY,
    Name VARCHAR(50),
    ChildItems VARCHAR(200) NOT NULL UNIQUE
)

CREATE TABLE Child
(
    Id INT PRIMARY KEY,
    Name VARCHAR(50)
)

CREATE TABLE ParentChild
(
    Id INT IDENTITY PRIMARY KEY,
    ParentId INT,
    ChildId INT
)

Triggers

-- This gives the unique column a default based upon the id of the parent
CREATE TRIGGER trg_Parent ON Parent
INSTEAD OF Insert
AS
    SET NOCOUNT ON
    INSERT INTO Parent (Id, Name, ChildItems)
    SELECT Id, Name, '/' + CAST(Id As Varchar(10)) + '/'
    FROM Inserted
GO

-- This updates the parent with a path based upon child items.
-- If the exact same child items exist for another parent then this fails
-- because of the unique index.

CREATE TRIGGER trg_ParentChild ON ParentChild
AFTER INSERT, UPDATE
AS
    DECLARE @ParentId INT = 0
    DECLARE @ChildItems VARCHAR(8000) = ''

    -- Visit each parent touched by the statement
    DECLARE parentCursor CURSOR FOR
        SELECT DISTINCT ParentId
        FROM Inserted

    OPEN parentCursor
    FETCH NEXT FROM parentCursor INTO @ParentId

    WHILE @@FETCH_STATUS = 0
    BEGIN
        -- Build the path of child ids, e.g. '/ 1/ 2'
        SELECT @ChildItems = COALESCE(@ChildItems + '/ ', '') + CAST(ChildId AS VARCHAR(10))
        FROM ParentChild
        WHERE ParentId = @ParentId
        ORDER BY ChildId

        -- Fails on the UNIQUE constraint if another parent has the same path
        UPDATE Parent
            SET ChildItems = @ChildItems
        WHERE Id = @ParentId

        FETCH NEXT FROM parentCursor INTO @ParentId
        SET @ChildItems = ''
    END
    CLOSE parentCursor
    DEALLOCATE parentCursor

GO

Data Setup

INSERT INTO Parent (Id, Name)
VALUES (1, 'Parent1'), (2, 'Parent2'), (3, 'Parent3')

INSERT INTO Child (Id, Name)
VALUES (1, 'Child1'), (2, 'Child2'), (3, 'Child3'), (4, 'Child4')

Now insert some data

-- This one succeeds: parents 1 and 2 get different child lists
INSERT INTO ParentChild (ParentId, ChildId)
VALUES (1,1),(1,2),(2,2),(2,3)

-- This one fails: parent 3 would get the same child list as parent 1
INSERT INTO ParentChild (ParentId, ChildId) VALUES (3,1),(3,2)
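If the cursor is the unappealing part, the same idea can be sketched in a set-based form. This is only an illustration, not the answer's method: it assumes SQL Server 2017 or later for STRING_AGG, the trigger name trg_ParentChild_SetBased is hypothetical, and it is meant to replace trg_ParentChild rather than run alongside it. It also avoids relying on ORDER BY inside a variable-concatenating SELECT, whose result order is not guaranteed.

```sql
-- Hypothetical set-based replacement for trg_ParentChild.
-- Assumes SQL Server 2017+ (STRING_AGG) and the tables defined above.
CREATE TRIGGER trg_ParentChild_SetBased ON ParentChild
AFTER INSERT, UPDATE, DELETE
AS
BEGIN
    SET NOCOUNT ON;

    -- Every parent whose child list may have changed in this statement
    WITH Affected AS (
        SELECT ParentId FROM inserted
        UNION
        SELECT ParentId FROM deleted
    )
    UPDATE p
    SET ChildItems = COALESCE(
            -- Ordered, '/'-separated list of child ids, e.g. '1/2'
            (SELECT STRING_AGG(CAST(pc.ChildId AS VARCHAR(10)), '/')
                        WITHIN GROUP (ORDER BY pc.ChildId)
             FROM ParentChild pc
             WHERE pc.ParentId = p.Id),
            -- No children left: fall back to the per-parent placeholder
            -- used by trg_Parent, so childless parents never collide
            '/' + CAST(p.Id AS VARCHAR(10)) + '/')
    FROM Parent p
    JOIN Affected a ON a.ParentId = p.Id;
END;
GO
```

As with the cursor version, the UNIQUE constraint on ChildItems is what rejects a duplicate family. The sketch assumes a parent never lists the same child twice; a unique constraint on (ParentId, ChildId) would make that explicit.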

Source: https://stackoverflow.com/questions/45863606/enforce-uniqueness-of-related-entities

Tags

sql-server

database-design

relational-database