What exactly does database normalization do?

北恋 2020-12-03 05:26

I'm new to databases, so please don't get upset with simple questions. As far as I have googled and gathered, normalization reduces redundancy of data and increases performance...

5 Answers
  •  一整个雨季
    2020-12-03 06:00

    Database normalisation is, at its simplest, a way to minimise data redundancy. To achieve that, certain forms of normalisation exist.

    First normal form can be summarised as:

    • no repeating groups in single tables.
    • separate tables for related information.
    • all items in a table related to the primary key.
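    As a rough sketch of the first two points, here is a made-up customer table with a repeating group (`phone1`, `phone2`) being split so that each phone number gets its own row. The table and column names are purely illustrative assumptions on my part, and SQLite is used only because it ships with Python.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Repeating group: phone1/phone2 columns in a single table (not 1NF).
    CREATE TABLE customer_flat (
        customer_id INTEGER PRIMARY KEY,
        name        TEXT,
        phone1      TEXT,
        phone2      TEXT
    );
    INSERT INTO customer_flat VALUES (1, 'Ann', '555-0100', '555-0101');
    INSERT INTO customer_flat VALUES (2, 'Bob', '555-0200', NULL);

    -- 1NF: the related information moves to its own table, one phone per row.
    CREATE TABLE customer (
        customer_id INTEGER PRIMARY KEY,
        name        TEXT
    );
    CREATE TABLE customer_phone (
        customer_id INTEGER REFERENCES customer(customer_id),
        phone       TEXT,
        PRIMARY KEY (customer_id, phone)
    );
""")

# Copy the data across, turning each non-empty phone column into a row.
conn.execute("INSERT INTO customer SELECT customer_id, name FROM customer_flat")
for col in ("phone1", "phone2"):
    conn.execute(
        f"INSERT INTO customer_phone "
        f"SELECT customer_id, {col} FROM customer_flat WHERE {col} IS NOT NULL"
    )

phones = conn.execute(
    "SELECT customer_id, phone FROM customer_phone ORDER BY customer_id, phone"
).fetchall()
print(phones)  # [(1, '555-0100'), (1, '555-0101'), (2, '555-0200')]
```

    A customer with a third phone number now needs another row rather than another column.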

    Second normal form adds another restriction, basically that every column not part of a candidate key must be dependent on the whole of every candidate key, not just on part of one (a candidate key being defined as a minimal set of columns whose combined values cannot be duplicated in the table).
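    To make the 2NF rule a bit more concrete, here is a small sketch that tests whether one set of columns determines another in some sample rows. The `holds` helper and the order data are my own invention for illustration. Keep in mind that sample data can only contradict a dependency, never prove it; real dependencies come from the business rules.

```python
def holds(rows, determinant, dependent):
    """Return True if no two rows agree on `determinant` but differ on `dependent`."""
    seen = {}
    for row in rows:
        key = tuple(row[c] for c in determinant)
        val = tuple(row[c] for c in dependent)
        if seen.setdefault(key, val) != val:
            return False
    return True

# Hypothetical table whose candidate key is (order_id, product_id).
order_items = [
    {"order_id": 1, "product_id": 10, "product_name": "Widget", "quantity": 2},
    {"order_id": 1, "product_id": 11, "product_name": "Gadget", "quantity": 1},
    {"order_id": 2, "product_id": 10, "product_name": "Widget", "quantity": 5},
]

# product_name depends on product_id alone, i.e. on only part of the key.
# That partial dependency is what 2NF forbids.
partial = holds(order_items, ["product_id"], ["product_name"])
print(partial)  # True

# quantity is not determined by product_id alone; it needs the whole key.
whole = holds(order_items, ["product_id"], ["quantity"])
print(whole)  # False

# The 2NF fix: move product_name to a table keyed by product_id.
products = {r["product_id"]: r["product_name"] for r in order_items}
items = [(r["order_id"], r["product_id"], r["quantity"]) for r in order_items]
print(products)  # {10: 'Widget', 11: 'Gadget'}
```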

    And third normal form goes a little further, in that every column not part of a candidate key must not be dependent on any other non-candidate-key column. In other words, it can depend only on the candidate keys. This leads to the saying that 3NF depends on the key, the whole key and nothing but the key, so help me Codd¹.
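    As an illustration of a 3NF layout, suppose an employee table would otherwise carry both `dept_id` and `dept_name`. The name depends on `dept_id`, which is not a key of the employee table, so it moves to its own table. The schema below is an assumed example, not anything from your database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- dept_name lives only where its key (dept_id) is the primary key.
    CREATE TABLE department (
        dept_id   INTEGER PRIMARY KEY,
        dept_name TEXT NOT NULL
    );
    CREATE TABLE employee (
        emp_id  INTEGER PRIMARY KEY,
        name    TEXT NOT NULL,
        dept_id INTEGER NOT NULL REFERENCES department(dept_id)
    );
    INSERT INTO department VALUES (1, 'Sales'), (2, 'Support');
    INSERT INTO employee VALUES (100, 'Ann', 1), (101, 'Bob', 1), (102, 'Cy', 2);
""")

# Renaming a department touches exactly one row, however many people work there.
conn.execute("UPDATE department SET dept_name = 'Revenue' WHERE dept_id = 1")

rows = conn.execute("""
    SELECT e.name, d.dept_name
    FROM employee AS e JOIN department AS d USING (dept_id)
    ORDER BY e.emp_id
""").fetchall()
print(rows)  # [('Ann', 'Revenue'), ('Bob', 'Revenue'), ('Cy', 'Support')]
```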

    Note that the above explanations are tailored toward your question rather than database theorists, so the descriptions are necessarily simplified (and I've used phrases like "summarised as" and "basically").

    The field of database theory is a complex one and, if you truly wish to understand it, you'll eventually have to get to the science behind it. But, in terms of your question, hopefully this will be adequate.

    Normalization is a valuable tool in ensuring we don't have redundant data (which becomes a real problem if the two redundant areas get out of sync). It doesn't generally increase performance.
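    Here is a minimal sketch of that out-of-sync problem, using a deliberately redundant table of my own invention in which the department name is repeated on every employee row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Redundant design: dept_name is stored once per employee.
    CREATE TABLE employee_flat (
        emp_id    INTEGER PRIMARY KEY,
        name      TEXT,
        dept_id   INTEGER,
        dept_name TEXT
    );
    INSERT INTO employee_flat VALUES
        (100, 'Ann', 1, 'Sales'),
        (101, 'Bob', 1, 'Sales');
""")

# An update that misses one of the copies leaves the data contradicting itself.
conn.execute("UPDATE employee_flat SET dept_name = 'Revenue' WHERE emp_id = 100")

# Find departments that now have more than one name.
conflicts = conn.execute("""
    SELECT dept_id, COUNT(DISTINCT dept_name)
    FROM employee_flat
    GROUP BY dept_id
    HAVING COUNT(DISTINCT dept_name) > 1
""").fetchall()
print(conflicts)  # [(1, 2)]
```

    Department 1 ends up with two different names and nothing in the table says which one is right.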

    In fact, although all databases should start in 3NF, it's sometimes acceptable to drop to 2NF for performance gains, provided you're aware of, and mitigate, the potential problems.

    And be aware that there are also "higher" levels of normalisation such as (obviously) fourth, fifth and sixth, but also Boyce-Codd and some others I can't remember off the top of my head. In the vast majority of cases, 3NF should be more than enough.


    ¹ If you don't know who Edgar Codd (or Christopher Date, for that matter) is, you should probably research them; they're the fathers of relational database theory.
