How to apply complex constraints to a database table in MySQL?

Question

As I am in the final stages of setting up a database for one of my projects, I have thought of an additional constraint that would need to be added to the Task table (see image below), but I am not sure how this can be implemented in MySQL.

Original database schema (without markups): [image link not preserved]

Database schema with markups: [image not preserved]

For each job (job), a WBS Code List (wbscodelist) is assigned. Each of these lists contains a number of WBS Codes (wbscodeitem) that apply to that job. An example would be:

Job A uses WBS Code List #1
Job B uses WBS Code List #2
Job C uses WBS Code List #1
etc.

WBS Code List #1 has codes: [100, 105, 110, 115, 120]
WBS Code List #2 has codes: [2180, 2190]
etc.

At the moment, task.fk_wbsCodeItemID is a foreign key of wbscodeitem.wbsCodeItemID (marked up in orange).

The problem that I am facing here is that a task could potentially use a WBS Code that does not apply to that job.

I would like to add a further constraint on task.fk_wbsCodeItemID so that it can only take values for which wbscodeitem.fk_wbsCodeListID equals job.fk_wbsCodeListID for that task's job (marked up in red).

How can I include this constraint within MySQL for this database schema? Would this issue possibly be due to the current design of this database (and would I need to change it)?

I understand this may require a little more detail, so I can include further details or clarify if necessary.

Answer 1:

One way to do it is via controlled redundancy. You can denormalize the functional dependency jobNumber -> fk_wbsCodeListID into joblocation and task, and use composite FK constraints to prevent inconsistencies. Similarly, the functional dependency wbsCodeItemID -> fk_wbsCodeListID can be denormalized into task. The overlapping composite FK constraints in task will then enforce your requirement:

CREATE TABLE `wbscodelist` (
  `wbsCodeListID` int(11) NOT NULL,
  `description` varchar(45) NOT NULL,
  PRIMARY KEY (`wbsCodeListID`)
) ENGINE=InnoDB;

CREATE TABLE `wbscodeitem` (
  `wbsCodeItemID` int(11) NOT NULL,
  `wbsCode` varchar(10) NOT NULL,
  `description` varchar(50) NOT NULL,
  `notes` varchar(80) NOT NULL,
  `fk_wbsCodeListID` int(11) NOT NULL,
  PRIMARY KEY (`wbsCodeItemID`),
  UNIQUE KEY (`wbsCodeItemID`,`fk_wbsCodeListID`),
  KEY (`fk_wbsCodeListID`),
  FOREIGN KEY (`fk_wbsCodeListID`) REFERENCES `wbscodelist` (`wbsCodeListID`) ON UPDATE CASCADE
) ENGINE=InnoDB;

CREATE TABLE `job` (
  `jobNumber` varchar(15) NOT NULL,
  `jobName` varchar(45) NOT NULL,
  `fk_wbsCodeListID` int(11) NOT NULL,
  `isActive` bit(1) NOT NULL,
  PRIMARY KEY (`jobNumber`),
  UNIQUE KEY (`jobNumber`,`fk_wbsCodeListID`),
  KEY (`fk_wbsCodeListID`),
  FOREIGN KEY (`fk_wbsCodeListID`) REFERENCES `wbscodelist` (`wbsCodeListID`) ON UPDATE CASCADE
) ENGINE=InnoDB;

CREATE TABLE `joblocation` (
  `jobLocationID` int(11) NOT NULL,
  `roomNumber` varchar(25) NOT NULL,
  `fk_jobNumber` varchar(15) NOT NULL,
  `fk_wbsCodeListID` int(11) NOT NULL,
  PRIMARY KEY (`jobLocationID`),
  KEY (`fk_jobNumber`,`fk_wbsCodeListID`),
  UNIQUE KEY (`jobLocationID`,`fk_wbsCodeListID`),
  FOREIGN KEY (`fk_jobNumber`, `fk_wbsCodeListID`) REFERENCES `job` (`jobNumber`, `fk_wbsCodeListID`) ON UPDATE CASCADE
) ENGINE=InnoDB;

CREATE TABLE `task` (
  `taskID` int(11) NOT NULL,
  `fk_JobLocationID` int(11) NOT NULL,
  `fk_JobNumber` varchar(15) NOT NULL,
  `fk_wbsCodeItemID` int(11) NOT NULL,
  `fk_wbsCodeListID` int(11) NOT NULL,
  PRIMARY KEY (`taskID`),
  KEY (`fk_wbsCodeItemID`,`fk_wbsCodeListID`),
  KEY (`fk_JobLocationID`,`fk_wbsCodeListID`),
  FOREIGN KEY (`fk_JobLocationID`, `fk_wbsCodeListID`) REFERENCES `joblocation` (`jobLocationID`, `fk_wbsCodeListID`) ON UPDATE CASCADE,
  FOREIGN KEY (`fk_wbsCodeItemID`, `fk_wbsCodeListID`) REFERENCES `wbscodeitem` (`wbsCodeItemID`, `fk_wbsCodeListID`) ON UPDATE CASCADE
) ENGINE=InnoDB;

Note the composite indexes to match the composite FK constraints.
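
To see the overlapping composite keys at work, here is a minimal sketch that is not part of the original answer. It assumes SQLite (through Python's built-in sqlite3 module) as a stand-in for MySQL/InnoDB so that it runs without a server, trims the tables to their key columns, and uses made-up sample rows; make_db and try_insert_task are hypothetical helper names.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL/InnoDB; tables are trimmed to the
# key columns and the sample rows are invented for illustration.
SCHEMA = """
CREATE TABLE wbscodelist (wbsCodeListID INTEGER PRIMARY KEY);
CREATE TABLE wbscodeitem (
  wbsCodeItemID    INTEGER PRIMARY KEY,
  wbsCode          TEXT NOT NULL,
  fk_wbsCodeListID INTEGER NOT NULL,
  UNIQUE (wbsCodeItemID, fk_wbsCodeListID),
  FOREIGN KEY (fk_wbsCodeListID) REFERENCES wbscodelist (wbsCodeListID)
);
CREATE TABLE job (
  jobNumber        TEXT PRIMARY KEY,
  fk_wbsCodeListID INTEGER NOT NULL,
  UNIQUE (jobNumber, fk_wbsCodeListID),
  FOREIGN KEY (fk_wbsCodeListID) REFERENCES wbscodelist (wbsCodeListID)
);
CREATE TABLE joblocation (
  jobLocationID    INTEGER PRIMARY KEY,
  fk_jobNumber     TEXT NOT NULL,
  fk_wbsCodeListID INTEGER NOT NULL,
  UNIQUE (jobLocationID, fk_wbsCodeListID),
  FOREIGN KEY (fk_jobNumber, fk_wbsCodeListID)
    REFERENCES job (jobNumber, fk_wbsCodeListID)
);
CREATE TABLE task (
  taskID           INTEGER PRIMARY KEY,
  fk_jobLocationID INTEGER NOT NULL,
  fk_wbsCodeItemID INTEGER NOT NULL,
  fk_wbsCodeListID INTEGER NOT NULL,
  FOREIGN KEY (fk_jobLocationID, fk_wbsCodeListID)
    REFERENCES joblocation (jobLocationID, fk_wbsCodeListID),
  FOREIGN KEY (fk_wbsCodeItemID, fk_wbsCodeListID)
    REFERENCES wbscodeitem (wbsCodeItemID, fk_wbsCodeListID)
);
INSERT INTO wbscodelist VALUES (1), (2);
INSERT INTO wbscodeitem VALUES (10, '100', 1), (20, '2180', 2);
INSERT INTO job VALUES ('A', 1), ('B', 2);
INSERT INTO joblocation VALUES (1, 'A', 1), (2, 'B', 2);
"""


def make_db():
    """Create an in-memory database with the trimmed schema and sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
    conn.executescript(SCHEMA)
    return conn


def try_insert_task(conn, task_id, location_id, item_id, list_id):
    """Return True if the task row is accepted, False if a constraint rejects it."""
    try:
        with conn:  # commit on success, roll back on error
            conn.execute(
                "INSERT INTO task VALUES (?, ?, ?, ?)",
                (task_id, location_id, item_id, list_id),
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

With these sample rows, a task at location 1 (job A, list 1) is accepted for item 10 but rejected for item 20 whichever list ID is supplied, because one of the two composite foreign keys always fails.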

An alternative option is to create triggers to check that the inserted/updated FK values are related via joins:

DELIMITER ;;

CREATE TRIGGER check_task_insert BEFORE INSERT ON task
    FOR EACH ROW
    BEGIN
        IF NOT EXISTS (
            SELECT 1
            FROM joblocation loc
            JOIN job ON loc.fk_jobNumber = job.jobNumber
            JOIN wbscodeitem itm ON job.fk_wbsCodeListID = itm.fk_wbsCodeListID
            WHERE loc.jobLocationID = new.fk_jobLocationID
            AND itm.wbsCodeItemID = new.fk_wbsCodeItemID
        ) THEN
            SIGNAL SQLSTATE '45000'   
            SET MESSAGE_TEXT = 'fk_wbsCodeItemID doesn\'t match fk_wbsCodeListID of associated job';
        END IF;
    END;
;;

CREATE TRIGGER check_task_update BEFORE UPDATE ON task
    FOR EACH ROW
    BEGIN
        IF NOT EXISTS (
            SELECT 1
            FROM joblocation loc
            JOIN job ON loc.fk_jobNumber = job.jobNumber
            JOIN wbscodeitem itm ON job.fk_wbsCodeListID = itm.fk_wbsCodeListID
            WHERE loc.jobLocationID = new.fk_jobLocationID
            AND itm.wbsCodeItemID = new.fk_wbsCodeItemID
        ) THEN
            SIGNAL SQLSTATE '45000'   
            SET MESSAGE_TEXT = 'fk_wbsCodeItemID doesn\'t match fk_wbsCodeListID of associated job';
        END IF;
    END;
;;

DELIMITER ;
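
For anyone who wants to try the trigger approach without a MySQL server, the following is a rough sketch under stated assumptions, not the code above verbatim. It uses SQLite through Python's sqlite3 module, and since SQLite has no SIGNAL statement the trigger is written with WHEN ... RAISE(ABORT, ...) instead of IF ... SIGNAL. The join is the same as in the MySQL triggers; the sample rows and the helper names open_trigger_demo and insert_task are invented.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL. Only the INSERT trigger is shown;
# an UPDATE trigger would repeat the same WHEN condition.
SETUP = """
CREATE TABLE wbscodeitem (
  wbsCodeItemID    INTEGER PRIMARY KEY,
  fk_wbsCodeListID INTEGER NOT NULL
);
CREATE TABLE job (
  jobNumber        TEXT PRIMARY KEY,
  fk_wbsCodeListID INTEGER NOT NULL
);
CREATE TABLE joblocation (
  jobLocationID INTEGER PRIMARY KEY,
  fk_jobNumber  TEXT NOT NULL
);
CREATE TABLE task (
  taskID           INTEGER PRIMARY KEY,
  fk_jobLocationID INTEGER NOT NULL,
  fk_wbsCodeItemID INTEGER NOT NULL
);
CREATE TRIGGER check_task_insert BEFORE INSERT ON task
FOR EACH ROW
WHEN NOT EXISTS (
  SELECT 1
  FROM joblocation loc
  JOIN job ON loc.fk_jobNumber = job.jobNumber
  JOIN wbscodeitem itm ON job.fk_wbsCodeListID = itm.fk_wbsCodeListID
  WHERE loc.jobLocationID = NEW.fk_jobLocationID
    AND itm.wbsCodeItemID = NEW.fk_wbsCodeItemID
)
BEGIN
  SELECT RAISE(ABORT, 'fk_wbsCodeItemID does not match the code list of the job');
END;
INSERT INTO wbscodeitem VALUES (10, 1), (20, 2);
INSERT INTO job VALUES ('A', 1), ('B', 2);
INSERT INTO joblocation VALUES (1, 'A'), (2, 'B');
"""


def open_trigger_demo():
    """Create an in-memory database with the trigger and sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SETUP)
    return conn


def insert_task(conn, task_id, location_id, item_id):
    """Return None on success, or the error message if the trigger rejects the row."""
    try:
        with conn:  # commit on success, roll back on error
            conn.execute(
                "INSERT INTO task VALUES (?, ?, ?)",
                (task_id, location_id, item_id),
            )
        return None
    except sqlite3.DatabaseError as exc:
        return str(exc)
```

Keep in mind that triggers on task only fire for writes to task. Changing job.fk_wbsCodeListID or wbscodeitem.fk_wbsCodeListID afterwards is not checked by them.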

Answer 2:

Can you explicitly constrain tasks to reference only code items applicable to their respective job? Yes. The question is more whether you need to do that at the database level, and whether the added complexity to your schema is worth it.

Which codes are available to which tasks is practically a textbook example of a business logic concern, and while it's not unheard of to incorporate business logic into the schema like you're suggesting, it's getting close to the boundary of what relational databases are best at. The database provides structure, and can make basic guarantees as to its consistency by enforcing direct relationships; what you have is an indirect relationship (task to code items associated with job), which you will need to either make direct or enforce by validating your inputs to the database.

To enforce the constraint in your schema, you'll need a new table wbscodelist_items which acts as a junction table between wbscodelist and wbscodeitem, replacing the one:many relationship defined by wbscodeitem.fk_wbsCodeListID with a many:many relationship. This opens its own can of worms -- now you can have code items on multiple lists, which may not be what you want! But task can reference wbscodelist_items which ensures that it cannot have a code item which is not on the correct list.

Even with all that, there's still no guarantee that the list belongs to the task's job. Simply giving wbscodelist_items a foreign key to job won't cut it -- you'll wind up with two duplicate relationships (tasks to jobs via joblocation and wbscodelist_items, and code items to jobs via wbscodelist_items and wbscodelist) which are completely unchecked and so could be inconsistent. In order to enforce the relationship without giving yourself even worse structural problems, you'd have to roll wbscodelist into wbscodelist_items and joblocation into task, duplicating data and generally making life more difficult anyway.

All told, I'd keep your structure and just validate your inputs.
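
As a rough idea of what "validate your inputs" could look like in application code, here is a sketch under assumptions: it uses Python with the built-in sqlite3 module purely so it is runnable, the table and column names follow the schema discussed in the question, and code_item_allowed, add_task and demo_db (with its sample rows) are hypothetical names, not an existing API.

```python
import sqlite3


def code_item_allowed(conn, job_location_id, wbs_code_item_id):
    """True if the code item is on the WBS code list of the location's job."""
    row = conn.execute(
        """
        SELECT 1
        FROM joblocation loc
        JOIN job ON job.jobNumber = loc.fk_jobNumber
        JOIN wbscodeitem itm ON itm.fk_wbsCodeListID = job.fk_wbsCodeListID
        WHERE loc.jobLocationID = ? AND itm.wbsCodeItemID = ?
        """,
        (job_location_id, wbs_code_item_id),
    ).fetchone()
    return row is not None


def add_task(conn, task_id, job_location_id, wbs_code_item_id):
    """Insert a task, refusing code items that do not apply to its job."""
    if not code_item_allowed(conn, job_location_id, wbs_code_item_id):
        raise ValueError("WBS code item does not apply to this job")
    with conn:  # commit on success, roll back on error
        conn.execute(
            "INSERT INTO task (taskID, fk_jobLocationID, fk_wbsCodeItemID) "
            "VALUES (?, ?, ?)",
            (task_id, job_location_id, wbs_code_item_id),
        )


def demo_db():
    """In-memory database with key columns only and made-up sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE wbscodeitem (
          wbsCodeItemID INTEGER PRIMARY KEY,
          fk_wbsCodeListID INTEGER NOT NULL
        );
        CREATE TABLE job (
          jobNumber TEXT PRIMARY KEY,
          fk_wbsCodeListID INTEGER NOT NULL
        );
        CREATE TABLE joblocation (
          jobLocationID INTEGER PRIMARY KEY,
          fk_jobNumber TEXT NOT NULL
        );
        CREATE TABLE task (
          taskID INTEGER PRIMARY KEY,
          fk_jobLocationID INTEGER NOT NULL,
          fk_wbsCodeItemID INTEGER NOT NULL
        );
        INSERT INTO wbscodeitem VALUES (10, 1), (20, 2);
        INSERT INTO job VALUES ('A', 1), ('B', 2);
        INSERT INTO joblocation VALUES (1, 'A'), (2, 'B');
        """
    )
    return conn
```

The trade-off is that a check like this only protects writes that go through add_task; anything that inserts into task directly bypasses it.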

Answer 3:

I have focused only on key attributes, and have six relations (tables) in total. All constraints are enforced by PKs and FKs, with no need for triggers. Admittedly, I did not quite understand the concept of code_list; if you need a list to easily manage sets of codes, then you may add another relation for that.

Note:

[Px]   = predicate x
[cx.y] = constraint x.y


PK  = PRIMARY KEY
AKn = ALTERNATE KEY (UNIQUE)
FKn = FOREIGN KEY

[P1] Job (job_ID) exists.

[c1.1] Job is identified by job_ID.

job {job_ID} -- P1
 PK {job_ID} -- c1.1

[P2] Location (loc_ID) exists.

[c2.1] Location is identified by loc_ID.

 
location {loc_ID} -- P2
      PK {loc_ID} -- c2.1

[P3] Job (job_ID) is assigned to location (loc_ID).

[c3.1] Each job may be assigned to more than one location; for each location that location may have more than one assigned job.

[c3.2] If a job is assigned to a location then that job must exist.

[c3.3] If a job is assigned to a location then that location must exist.

job_location {job_ID, loc_ID} -- P3
          PK {job_ID, loc_ID} -- c3.1

FK1 {job_ID}  REFERENCES job {job_ID}      -- c3.2
FK2 {loc_ID}  REFERENCES location {loc_ID} -- c3.3

[P4] WBS code (wbs_ID) exists.

[c4.1] WBS code is identified by wbs_ID.

wbs_code {wbs_ID} -- P4
      PK {wbs_ID} -- c4.1

[P5] Job (job_ID) is assigned wbs code (wbs_ID).

[c5.1] Each job may be assigned more than one wbs code; for each wbs code that wbs code may be assigned to more than one job.

[c5.2] If a job is assigned a wbs code, then that job must exist.

[c5.3] If a job is assigned a wbs code, then that wbs code must exist.

job_wbs {job_ID, wbs_ID} -- P5
     PK {job_ID, wbs_ID} -- c5.1

FK1 {job_ID} REFERENCES job {job_ID}      -- c5.2
FK2 {wbs_ID} REFERENCES wbs_code {wbs_ID} -- c5.3

[P6] Task number (job_task_No) of Job (job_ID) is performed at location (loc_ID), with assigned wbs code (wbs_ID).

[c6.1] Task is identified by combination of job_ID and job_task_No.

[c6.2] If a task of a job is performed, then that job must exist.

[c6.3] If a task of a job is performed at a location, then that job must be assigned to that location.

[c6.4] If a task of a job is performed with an assigned wbs code, then that wbs code must be assigned to that job.

task {job_ID, job_task_No, loc_ID, wbs_ID} -- P6
  PK {job_ID, job_task_No}                 -- c6.1

FK1 {job_ID} REFERENCES job {job_ID}                          -- c6.2
FK2 {job_ID, loc_ID} REFERENCES job_location {job_ID, loc_ID} -- c6.3
FK3 {job_ID, wbs_ID} REFERENCES job_wbs {job_ID, wbs_ID}      -- c6.4
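
The notation above maps fairly directly onto DDL. The following is a sketch rather than the answerer's own code: it assumes integer key columns (the answer gives no types), uses SQLite through Python's sqlite3 module as a stand-in for MySQL so that it runs without a server, and build_db, add_task and the sample rows are invented for illustration.

```python
import sqlite3

# Assumption: INTEGER keys and SQLite as a stand-in for MySQL. Comments give
# the constraint labels from the answer that each clause corresponds to.
DDL = """
CREATE TABLE job (job_ID INTEGER PRIMARY KEY);                -- c1.1
CREATE TABLE location (loc_ID INTEGER PRIMARY KEY);           -- c2.1
CREATE TABLE job_location (
  job_ID INTEGER NOT NULL,
  loc_ID INTEGER NOT NULL,
  PRIMARY KEY (job_ID, loc_ID),                               -- c3.1
  FOREIGN KEY (job_ID) REFERENCES job (job_ID),               -- c3.2
  FOREIGN KEY (loc_ID) REFERENCES location (loc_ID)           -- c3.3
);
CREATE TABLE wbs_code (wbs_ID INTEGER PRIMARY KEY);           -- c4.1
CREATE TABLE job_wbs (
  job_ID INTEGER NOT NULL,
  wbs_ID INTEGER NOT NULL,
  PRIMARY KEY (job_ID, wbs_ID),                               -- c5.1
  FOREIGN KEY (job_ID) REFERENCES job (job_ID),               -- c5.2
  FOREIGN KEY (wbs_ID) REFERENCES wbs_code (wbs_ID)           -- c5.3
);
CREATE TABLE task (
  job_ID      INTEGER NOT NULL,
  job_task_No INTEGER NOT NULL,
  loc_ID      INTEGER NOT NULL,
  wbs_ID      INTEGER NOT NULL,
  PRIMARY KEY (job_ID, job_task_No),                                      -- c6.1
  FOREIGN KEY (job_ID) REFERENCES job (job_ID),                           -- c6.2
  FOREIGN KEY (job_ID, loc_ID) REFERENCES job_location (job_ID, loc_ID),  -- c6.3
  FOREIGN KEY (job_ID, wbs_ID) REFERENCES job_wbs (job_ID, wbs_ID)        -- c6.4
);
INSERT INTO job VALUES (1), (2);
INSERT INTO location VALUES (1);
INSERT INTO job_location VALUES (1, 1), (2, 1);
INSERT INTO wbs_code VALUES (100), (2180);
INSERT INTO job_wbs VALUES (1, 100), (2, 2180);
"""


def build_db():
    """Create an in-memory database with the six relations and sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
    conn.executescript(DDL)
    return conn


def add_task(conn, job_id, task_no, loc_id, wbs_id):
    """Return True if the task row is accepted, False if a constraint rejects it."""
    try:
        with conn:  # commit on success, roll back on error
            conn.execute(
                "INSERT INTO task (job_ID, job_task_No, loc_ID, wbs_ID) "
                "VALUES (?, ?, ?, ?)",
                (job_id, task_no, loc_id, wbs_id),
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

Because job_ID takes part in both composite foreign keys of task, a task cannot pair a job with a location or a wbs code that was not assigned to that same job.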

Answer 4:

The solution is fairly simple. It just takes a little tweaking with the design.

It looks like a job location can host only one job but a job could be spread out over several locations. Also a list may contain several code items and an item can appear on only one list with each list associated with only one job.

Let's work with that for a moment.

create table Jobs(
  ID           int  primary key,
  Name         varchar( 45 ),
  IsActive     bit
);

create table Locations(
  ID           int primary key,
  RoomNum      varchar( 25 )
);

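create table JobLocations(
  LocID        int unique,  -- a location hosts only one job
  JobID        int,         -- a job may span several locations
  constraint PK_JobLocations primary key( LocID, JobID ),
  -- table-level foreign keys: MySQL 8.0 and earlier parse but ignore inline REFERENCES
  constraint FK_JobLocations_Loc foreign key( LocID ) references Locations( ID ),
  constraint FK_JobLocations_Job foreign key( JobID ) references Jobs( ID )
);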

create table Items(
  ID           int primary key,
  Code         varchar( 10 ),
  Description  varchar( 50 ),
  Notes        varchar( 80 )
);

create table ItemLists(
  ID           int primary key,
  Description  varchar( 45 )
);

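create table ListItems(
  ListID       int,
  ItemID       int unique,  -- an item appears on only one list
  constraint PK_ListItems primary key( ListID, ItemID ),
  constraint FK_ListItems_List foreign key( ListID ) references ItemLists( ID ),
  constraint FK_ListItems_Item foreign key( ItemID ) references Items( ID )
);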

Now each list must be associated with a job.

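create table JobLists(
  JobID        int,
  ListID       int,
  constraint PK_JobLists primary key( JobID, ListID ),
  constraint FK_JobLists_Job foreign key( JobID ) references Jobs( ID ),
  constraint FK_JobLists_List foreign key( ListID ) references ItemLists( ID )
);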

Now you have a task which associates with one item for a particular job at a particular location. In order to have complete data integrity, the task must refer to a job, its location, a list associated with the job and then an item which appears on the list. This means you have to add two fields to the Tasks table.

create table Tasks(
  ID           int primary key,
  LocID        int,
  JobID        int,
  ListID       int,
  ItemID       int,
  constraint FK_TaskJobLoc foreign key( LocID, JobID ) references JobLocations( LocID, JobID ),
  constraint FK_TaskJobList foreign key( JobID, ListID ) references JobLists( JobID, ListID ),
  constraint FK_TaskListItem foreign key( ListID, ItemID ) references ListItems( ListID, ItemID )
);

Now you've guaranteed that the list must be associated with a job at a specific location and the item must be associated with the list. Thus the item must be associated with the job.

I've thrown this together rather quickly so check the complete chain of references carefully. However, you can see that if you want to refer to an object at the head of a reference chain and an object at the end of that chain, you have to use the intermediate references.

Source: https://stackoverflow.com/questions/42571843/how-to-apply-complex-constraints-to-a-database-table-in-mysql

Tags

mysql

database

database-design

relational-database