relationships between 3 entities in ER diagram--is a ternary enough or are 2 binaries also needed?

一世执手 提交于 2019-11-28 12:25:48

TL;DR You need all three relationship types/tables. Because if you drop one then in some situations you lose data--there is no way to use the remaining ones to answer all the same questions.

Normalization to higher NFs (normal forms) tells us when we can replace a relationship/table by smaller/simpler ones.


Each relationship table holds the rows that participate in the relationship. We can describe the relationship via a predicate (statement template):

1 Divides_to holds (T, P) rows where project P divides to task T
2 Has holds (E, P) rows where employee E is assigned to project P
3 holds (E, T, P) rows where employee E is assigned to task T on project P

Can we drop 1? If we ignore employees in 3 then we get rows where some employee is assigned to task T on project P. But (per above) that is not the rows in 1. Maybe project P divides to task T in 1 but no employee is assigned to task T on project P; then that row in 1 is not a subrow in 3. And there is no task info in 2. So we can't use 3 & 2 to replace 1.

Can we drop 2? Similarly, if we ignore tasks in 3 then we don't get 2, and there is no employee info in 1. So we can't use 3 & 1 to replace 2.

Can we drop 3? Using 1 & 2 we can get rows where employee E is assigned to project P AND project P divides to task T. But (per above) that is not the rows in 3. They differ if an employee assigned to a project isn't assigned to all its tasks or if a task of a project doesn't have all its employees assigned to it. There's no other way to generate 3 from 1 & 2. So we can't use 1 & 2 to replace 3.

So we need all three relationships.

Normalization to higher NFs guides decomposing a relationship into simpler others by which it can be expressed instead.


PS 1 That's also why we need the entity types/tables and not just the relationship types/tables. (If we didn't want them anyway for entity-specific attributes or just ER modeling conventions.) Eg the three relationships can't tell you about employees that aren't assigned to a project or to a task & project. Similarly for tasks & for projects.

PS 2 We ignore an attribute in relational algebra by not projecting on it. We ignore a column in SQL by not selecting it. The result's predicate is that FOR SOME value for the attribute/column, the old predicate holds. Relational natural join gives the rows whose relationship/predicate is the AND of input relationships/predicates. In SQL for no duplicate rows & no shared nullable columns that's select distinct from natural join.

PS 3 Your design satisfies certain constraints: If a task-project pair appears in 3 then it must appear in 1 and if an employee-project pair appears in 3 then it must appear in 2. One way to reflect that in ER modeling is by reifying the task-project & employee-project relationships to associative entities then replacing 3 by a what ER calls a binary relationship on the entities. Relationally, the relationship/table is still ternary, where certain subrows happen to identify those entities. A way to get a constrained relationally binary 3 is to add an employee-project PK (primary key) or CK (candidate key) id in 2 and replace the composite FK (foreign key) in 3 by such an id. Then we have a binary on entities and on values. Some pseudo-ER methods do this.

PS 4 This style of (true Chen) ER diagram doesn't typically use SQL nulls. But as it happens you could replace all three relationships by a variant of 3 with nulls. You would null-extend the binary relations and union them with the ternary. As usual, nulls complicate predicates. Usually we add a nullable column as an alternative to adding a separate table sharing a null-free CK (candidate key). But this is different, without the savings in space or joins; it just complicates things. (Including important constraints.)

    E IS NULL
AND task T is of project P
AND NOT EXISTS E [employee E is assigned to task T of project P]
OR  T IS NULL
AND employee E is assigned to project P
AND NOT EXISTS T [employee E is assigned to task T of project P]
OR  employee E is assigned to task T of project P

(Also it's problematic in SQL because SQL unique, primary key & join are not the relational things by those names because they treat null specially.)

PS 5 Some answers of mine re such ternary vs binary relation(ship) types/tables/predicates:
Should this ER diagram use a ternary relationship instead
Best Solution - Ternary or Binary Relationship
Why can't you just join in fan trap?
And re design & predicates:
Modeling multiple many to many relationships between the same entities in a relational database
What is the difference between an entity relationship model and a relational model?
Is there any rule of thumb to construct SQL query from a human-readable description?

PS 6 Has is an unhelpfully generic relationship name/meaning/table. Use meaningful names like Is_assigned_to or Assignment.

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!