Can you move compound keys and/or foreign keys to other tables when normalizing to 3NF (third normal form)

问题

My database design is currently at stage 3NF. The issue that is of concern is the foreign keys and in some cases compound keys.

My question is this:

Can you move compound keys and/or foreign keys to create other tables provided the attributes associated with the compound keys/foreign keys do not rely on the primary key?

I think the answer is yes due to this link:

Are Foreign Keys included in Third Normal Form?
Best Answer: Just because it's a foreign key doesn't mean it also can't be considered an attribute of the primary key. The fact that it's a foreign key to begin with implies it's defining a relationship with another table, and thus would not violate [...] 3NF.
-- TheMadProfessor
https://answers.yahoo.com/question/index?qid=20081117095121AAXWBbX#

This leads me to wonder whether my current normalization stage that I am processing is actually in 3NF.

I cannot post my tables in regards to this question as I am new and do not have enough points.

回答1:

I'm not sure that I understand your question. If you are asking, "Can you have a foreign key in a table without violating 3NF?" the answer is absolutely, positively yes. Nothing about any stage of normalization says that you should eliminate foreign keys. Indeed, it's pretty much impossible to normalize all but the most trivial data without using foreign keys.

** Update **

Okay, maybe now I understand your question, but then I think you've answered it for yourself. Yes, in a fully-normalized DB, you should not have non-key dependencies. If you have FKs that are not dependent on the PK, then they should be moved to another table.

To make a simple example, suppose you want to keep track of people, the city they live in, and the country that that city is in. So for your first draft, you make this structure: (Asterisk marks the PK.)

Person (person_id*, person_name, city_id, country_id)
City (city_id*, city_name)
Country (country_id*, country_name)

This is not normalized. A city is in the same country regardless of what resident of that city we are talking about. Paris is not in France when we are talking about Pierre but in Germany when we are talking about Francois. (If there are two cities with the same name, of course those are different cities and should have different records. I suppose a city could cross national boundaries, but for our purposes here let's assume that if it does, we consider it two cities that happen to touch. They would surely have different city governments, different postal systems, etc.) So we have a non-key dependency. country_id depends on city_id, not on person_id.

So to normalize this schema, we should move the country_id to a table where it is dependent solely on the PK. Presumably, the City table:

Person (person_id*, person_name, city_id)
City (city_id*, city_name, country_id)
Country (country_id*, country_name)

回答2:

Preamble

In pure relational database theory, there is nothing to stop you having composite primary keys (PKs), and you can have foreign keys (FKs) that reference them and those FKs are necessarily composite too. Some software has difficulty with composite keys, so you often find that people add an ID column which contains an automatically generated number, which is then designated as the PK of the table. Other tables can then have (simple) FKs that reference the (simple) ID column. One not uncommon mistake is to forget that the original composite PK is still a candidate key (CK), and its uniqueness should be enforced by the DBMS with a unique constraint on the table; it becomes an alternative key (AK).

Diversion

The system of CKs, AKs and PKs works like this:

Every CK is a set of (one or more) columns that is a unique identifier for the data in the rest of each row of data in a table.
One CK may be designated as the PK.
The other CKs become AKs.

Consider this table:

CREATE TABLE elements
(
    atomic_number   INTEGER NOT NULL PRIMARY KEY
                    CHECK (atomic_number > 0 AND atomic_number < 120),
    symbol          CHAR(3) NOT NULL UNIQUE,
    name            CHAR(20) NOT NULL UNIQUE,
    atomic_weight   DECIMAL(8,4) NOT NULL,
    period          SMALLINT NOT NULL
                    CHECK (period BETWEEN 1 AND 7),
    group           CHAR(2) NOT NULL
                    -- 'L' for Lanthanoids, 'A' for Actinoids
                    CHECK (group IN ('1', '2', 'L', 'A', '3', '4', '5', '6',
                                     '7', '8', '9', '10', '11', '12', '13',
                                     '14', '15', '16', '17', '18')),
    stable          CHAR(1) DEFAULT 'Y' NOT NULL
                    CHECK (stable IN ('Y', 'N'))
);

Each of atomic_number, symbol and name is a candidate key. For chemistry, the symbol is most convenient as the primary key; for physics, the atomic_number is most convenient. The tables related to isotopes etc reference the atomic_number column, but the tables related to chemical compounds reference the symbol column. The three CKs here are all simple; on the other hand, the isotopes table has a compound PK consisting of the atomic number of the element (the number of protons) and the number of neutrons.

Answer

Getting back to your question, your data may well be in 3NF, or more likely BCNF (which is formally stronger than 3NF).

You'd have to show us your table schemas and specify the constraints (functional dependencies, etc) that apply to the columns before we could assess your design. But there is nothing that you've described which, a priori, prevents it from being well normalized.

回答3:

Do you understand what a FD (functional dependency) is? A FD is an expression with an arrow between two attribute sets. Given a table and a FD, saying that the FD holds in the table or the table has the FD or the table satisfies the FD says all subrows for the first attribute set appear with the same subrow for the second attribute set.

Do you understand that normalization involves CKs (candidate keys)? Independent of normalization we can call one CK a PK (primary key) and the others AKs (alternate keys).

Do you understand what a FK (foreign key) is? Given a database, a table and an attribute list, saying that the list is a FK referencing some attribute list in some table says that every list of values for the attributes in the first table is also a list of of values for the attributes in the second table where those attributes form a CK (candidate key).

Normalization uses FDs & JDs (join dependencies) to replace a table by projections of it that join back to it. FKs (foreign keys) are irrelevant to normalization. That's both to whether a table is in a given NF (normal form) and to decomposing to a given NF. The answer you link to is also saying that FKs are irrelevant to whether a table is in 3NF--first for a specific case, then for the general case.

Because components share column values from an original table, some FKs will arise from normalization. Just as various tables and their PKs, AKs, CKs, superkeys, FDs & JDs will. That is normalization output, not input.

Since normalization replaces tables by projections of them, column sets in common among the original & components must contain the same subrow values. That's an EQD (equality dependency). Subrow values that form CKs will thus have FKs to them arise from normalization.

But often during normalization we see that we want to replace a table by some that are projections with others that have at least the subrows of a projection. Common subrows in a projection must then appear in expanded components, but not vice versa. That's an IND (inclusion dependency). There will still be a FK when common columns form a CK in a table that is a superset of a projection. Such a design change isn't normalization, it is just a change that you noticed during normalization from a design that was wrong in not allowing all the possible business situation to be recorded to a design that doesn't have that problem.

"Compound" vs "simple" vs "empty" apply to FKs, PKs, AKs, CKs (candidate keys), & superkeys and are relevant to FD & JD parts. Definitions will specify compound/simple/empty when it matters. (Eg: Sometimes FDs are put into canonical forms involving single columns. Sometimes we can easily infer NFs hold partly based on whether CKs are simple.)

After you get your tables, declare (sufficient) constraints. Then the DBMS can enforce them. FKs, PKs, AKs, CKs, superkeys, FDs & JDs all have associated constraints. SQL lets you declare all PKs & AKs (via PRIMARY KEY & UNIQUE NOT NULL). Those declarations actually declare superkeys which happen to be CKs/PKs/AKs when no smaller ones are declared within them. Similarly SQL FOREIGN KEY declares a foreign superkey that is a FK if it is actually to a CK. Declare sufficient chains of FKs to enforce the ones you don't declare. (Via transitivity.) SQL DBMSs typically won't let you declare FK cycles. SQL also makes you declare superkeys referenced by FK declarations whether or not those columns contain a smaller declared superkey/CK and so must be a superkey. Declare or enforce via triggers any FD or JD constraints that aren't implied by CK constraints. (5NF gets rid of all such constraints except some cycles of FDs on CKs.)

Find academic textbook definitions and algorithms.

来源：https://stackoverflow.com/questions/12322080/can-you-move-compound-keys-and-or-foreign-keys-to-other-tables-when-normalizing

标签

foreign-keys

relational-database

database-normalization

3nf