In general, should every table in a database have an identity field to use as a PK?

旧时难觅i 2020-12-07 15:36

This seems like a duplicate even as I ask it, but I searched and didn't find it. It seems like a good question for SO -- even though I'm sure I can find it on many blogs e

10 answers
  • 2020-12-07 16:21

    If you have modelled, designed, normalised, etc., then you will have no identity columns.

    You will have identified natural and candidate keys for your tables.

    You may then decide on a surrogate key because of the physical architecture (e.g. you want a key that is narrow, numeric, and strictly monotonically increasing), say, because using an nvarchar(100) column as the key is not a good idea (you still need a unique constraint on the natural key).

    Or because of ideology: I've found that surrogate keys appeal to OO developers.

    OK, assume ID columns everywhere. As your database gets more complex, say several layers deep, how can you join parent and grandchild tables directly? You can't: you always need the intermediate tables and well-indexed PK-FK columns. With a composite key, it's all there for you...
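
    The point about joining a parent to a grandchild can be sketched as follows. This is only an illustration, assuming SQLite (through Python's `sqlite3` module) as a stand-in engine; the `customer`, `customer_order` and `order_line` tables are hypothetical and do not come from the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical three-level hierarchy; names and types are illustrative.
    CREATE TABLE customer (
        customer_code TEXT PRIMARY KEY,
        name          TEXT NOT NULL
    );
    CREATE TABLE customer_order (
        customer_code TEXT    NOT NULL REFERENCES customer (customer_code),
        order_no      INTEGER NOT NULL,
        PRIMARY KEY (customer_code, order_no)
    );
    CREATE TABLE order_line (
        customer_code TEXT    NOT NULL,
        order_no      INTEGER NOT NULL,
        line_no       INTEGER NOT NULL,
        product       TEXT    NOT NULL,
        PRIMARY KEY (customer_code, order_no, line_no),
        FOREIGN KEY (customer_code, order_no)
            REFERENCES customer_order (customer_code, order_no)
    );
    INSERT INTO customer VALUES ('ACME', 'Acme Ltd');
    INSERT INTO customer_order VALUES ('ACME', 1);
    INSERT INTO order_line VALUES ('ACME', 1, 1, 'widget'), ('ACME', 1, 2, 'gadget');
""")

# Parent joined straight to grandchild: customer_order is never touched,
# because the customer's key is part of order_line's composite key.
rows = conn.execute("""
    SELECT c.name, ol.product
    FROM   order_line ol
    JOIN   customer c ON c.customer_code = ol.customer_code
    ORDER  BY ol.line_no
""").fetchall()
print(rows)
```

    With a surrogate id on every table, `order_line` would carry only an order id, and the same query would have to pass through `customer_order`.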

    Don't get me wrong: I use them. But I know why I use them...

    Edit:

    I'd be interested to collate "always ID"+"no stored procs" matches on one hand, with "use stored procs"+"IDs when they benefit" on the other...

  • 2020-12-07 16:22

    There are two concepts that are close but should not be confused: IDENTITY and PRIMARY KEY.

    Every table (except in rare cases) should have a PRIMARY KEY, that is, a value or a set of values that uniquely identifies a row.

    See here for a discussion of why.

    IDENTITY is a property of a column in SQL Server which means that the column will be filled automatically with incrementing values.

    Due to the nature of this property, the values of this column are inherently UNIQUE.

    However, no UNIQUE constraint or UNIQUE index is automatically created on an IDENTITY column, and after issuing SET IDENTITY_INSERT table_name ON it's possible to insert duplicate values into an IDENTITY column, unless it has been explicitly UNIQUE constrained.

    An IDENTITY column does not necessarily have to be a PRIMARY KEY, but it is most often used to fill surrogate PRIMARY KEYs.

    It may or may not be useful in any particular case.

    Therefore, the answer to your question:

    The question: should every table in a database have an IDENTITY field that's used as the PK?

    is this:

    No. There are cases when a database table should NOT have an IDENTITY field as a PRIMARY KEY.

    Three cases come to mind when it's not the best idea to have an IDENTITY as a PRIMARY KEY:

    • If your PRIMARY KEY is composite (like in many-to-many link tables)
    • If your PRIMARY KEY is natural (like, a state code)
    • If your PRIMARY KEY should be unique across databases (in this case you use GUID / UUID / NEWID)

    All these cases imply the following condition:

    You shouldn't have IDENTITY when you care about the values of your PRIMARY KEY and insert them into your table explicitly.
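
    A minimal sketch of the three cases, assuming SQLite (through Python's `sqlite3` module) rather than SQL Server. The `state`, `student`, `course`, `enrollment` and `document` tables are hypothetical, and `uuid.uuid4()` stands in for GUID / NEWID:

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Natural key: the state code itself identifies the row.
    CREATE TABLE state (
        code TEXT PRIMARY KEY,
        name TEXT NOT NULL
    );
    -- Composite key: a link table identified by the pair it links.
    CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE course  (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
    CREATE TABLE enrollment (
        student_id INTEGER NOT NULL REFERENCES student (id),
        course_id  INTEGER NOT NULL REFERENCES course (id),
        PRIMARY KEY (student_id, course_id)
    );
    -- Key unique across databases: a UUID generated by the caller.
    CREATE TABLE document (
        id    TEXT PRIMARY KEY,
        title TEXT NOT NULL
    );
""")

conn.execute("INSERT INTO state VALUES ('CA', 'California')")
conn.execute("INSERT INTO student VALUES (1, 'Ann')")
conn.execute("INSERT INTO course VALUES (10, 'Databases')")
conn.execute("INSERT INTO enrollment VALUES (1, 10)")
doc_id = str(uuid.uuid4())
conn.execute("INSERT INTO document VALUES (?, ?)", (doc_id, "Spec"))


def is_rejected(sql, params=()):
    """Return True if the statement violates a constraint."""
    try:
        conn.execute(sql, params)
    except sqlite3.IntegrityError:
        return True
    return False


# Each primary key rejects a second row with the same explicit value.
dup_state = is_rejected("INSERT INTO state VALUES ('CA', 'Calif.')")
dup_link = is_rejected("INSERT INTO enrollment VALUES (1, 10)")
dup_doc = is_rejected("INSERT INTO document VALUES (?, ?)", (doc_id, "Copy"))
print(dup_state, dup_link, dup_doc)
```

    In all three tables the key value is supplied by the caller, so an auto-incrementing column would have nothing to contribute.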

    Update:

    Many-to-many link tables should use the pair of ids of the tables they link as their composite key.

    It's a natural composite key which you already have to use (and make UNIQUE), so there is no point in generating a surrogate key for it.

    I don't see why you would want to reference a many-to-many link table from any table other than the ones it links, but let's assume you have such a need.

    In this case, you just reference the link table by the composite key.

    This query:

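    -- Column types are illustrative; the original schema left them out.
    CREATE TABLE a (id INT PRIMARY KEY, data VARCHAR(100));
    CREATE TABLE b (id INT PRIMARY KEY, data VARCHAR(100));
    CREATE TABLE ab (a_id INT NOT NULL REFERENCES a (id), b_id INT NOT NULL REFERENCES b (id), PRIMARY KEY (a_id, b_id));
    CREATE TABLE business_rule (id INT PRIMARY KEY, a_id INT NOT NULL, b_id INT NOT NULL, FOREIGN KEY (a_id, b_id) REFERENCES ab (a_id, b_id));
    
    -- a_id is stored on business_rule itself, so a single join reaches a
    SELECT  *
    FROM    business_rule br
    JOIN    a
    ON      a.id = br.a_id;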
    

    is much more efficient than this one:

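    -- Column types are illustrative; the original schema left them out.
    CREATE TABLE a (id INT PRIMARY KEY, data VARCHAR(100));
    CREATE TABLE b (id INT PRIMARY KEY, data VARCHAR(100));
    CREATE TABLE ab (id INT PRIMARY KEY, a_id INT NOT NULL REFERENCES a (id), b_id INT NOT NULL REFERENCES b (id), UNIQUE (a_id, b_id));
    CREATE TABLE business_rule (id INT PRIMARY KEY, ab_id INT NOT NULL REFERENCES ab (id));
    
    -- a_id lives only on ab, so reaching a takes an extra join through ab
    SELECT  *
    FROM    business_rule br
    JOIN    ab
    ON      ab.id = br.ab_id
    JOIN    a
    ON      a.id = ab.a_id;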
    

    because it reaches a with one join instead of two.
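
    Both variants can be tried side by side. The sketch below assumes SQLite (through Python's `sqlite3` module) with illustrative column types; the `_s` suffix on the surrogate-key tables is a hypothetical naming choice made only so that both schemas fit in one database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER PRIMARY KEY, data TEXT);
    CREATE TABLE b (id INTEGER PRIMARY KEY, data TEXT);

    -- Variant 1: link table keyed by the pair, referenced by that pair.
    CREATE TABLE ab (
        a_id INTEGER NOT NULL REFERENCES a (id),
        b_id INTEGER NOT NULL REFERENCES b (id),
        PRIMARY KEY (a_id, b_id)
    );
    CREATE TABLE business_rule (
        id INTEGER PRIMARY KEY, a_id INTEGER NOT NULL, b_id INTEGER NOT NULL,
        FOREIGN KEY (a_id, b_id) REFERENCES ab (a_id, b_id)
    );

    -- Variant 2: link table with a surrogate id, referenced by that id.
    CREATE TABLE ab_s (
        id   INTEGER PRIMARY KEY,
        a_id INTEGER NOT NULL REFERENCES a (id),
        b_id INTEGER NOT NULL REFERENCES b (id),
        UNIQUE (a_id, b_id)
    );
    CREATE TABLE business_rule_s (
        id    INTEGER PRIMARY KEY,
        ab_id INTEGER NOT NULL REFERENCES ab_s (id)
    );

    INSERT INTO a VALUES (1, 'a-one');
    INSERT INTO b VALUES (2, 'b-two');
    INSERT INTO ab VALUES (1, 2);
    INSERT INTO business_rule VALUES (100, 1, 2);
    INSERT INTO ab_s VALUES (7, 1, 2);
    INSERT INTO business_rule_s VALUES (100, 7);
""")

# Variant 1: a_id is already on business_rule, so one join reaches a.
composite = conn.execute("""
    SELECT br.id, a.data
    FROM   business_rule br
    JOIN   a ON a.id = br.a_id
""").fetchall()

# Variant 2: a_id lives only on the link table, so two joins are needed.
surrogate = conn.execute("""
    SELECT br.id, a.data
    FROM   business_rule_s br
    JOIN   ab_s ON ab_s.id = br.ab_id
    JOIN   a    ON a.id = ab_s.a_id
""").fetchall()
print(composite, surrogate)
```

    Both queries return the same rows; the difference is only in how many tables have to be visited to get them.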

  • 2020-12-07 16:26

    Recognize the distinction between an Identity field and a key... Every table should have a key, to prevent the data corruption caused by inadvertently entering multiple rows that represent the same 'entity'. If the only key a table has is a meaningless surrogate key, then this protection is effectively missing.
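
    A small sketch of that failure mode, assuming SQLite (through Python's `sqlite3` module), where `AUTOINCREMENT` plays the role of an identity column. The two `country_*` tables are hypothetical:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Only key is a generated surrogate: nothing identifies the entity itself.
    CREATE TABLE country_surrogate_only (
        id       INTEGER PRIMARY KEY AUTOINCREMENT,
        iso_code TEXT NOT NULL,
        name     TEXT NOT NULL
    );
    -- Same surrogate, plus a UNIQUE constraint on the natural key.
    CREATE TABLE country_with_natural_key (
        id       INTEGER PRIMARY KEY AUTOINCREMENT,
        iso_code TEXT NOT NULL UNIQUE,
        name     TEXT NOT NULL
    );
""")

insert_a = "INSERT INTO country_surrogate_only (iso_code, name) VALUES (?, ?)"
conn.execute(insert_a, ("FR", "France"))
conn.execute(insert_a, ("FR", "France"))  # accepted: same entity, new id
dup_count = conn.execute(
    "SELECT COUNT(*) FROM country_surrogate_only WHERE iso_code = 'FR'"
).fetchone()[0]

insert_b = "INSERT INTO country_with_natural_key (iso_code, name) VALUES (?, ?)"
conn.execute(insert_b, ("FR", "France"))
try:
    conn.execute(insert_b, ("FR", "France"))
    rejected = False
except sqlite3.IntegrityError:
    rejected = True  # the natural key catches the duplicate entity

print(dup_count, rejected)
```

    The surrogate-only table ends up with two rows for the same country, while the table with a declared natural key refuses the second insert.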

    On the other hand, no table 'needs' an identity, and certainly not every table benefits from one. Examples: a table with a short and functional key, a table that no other table references through a foreign key, or a table that is in a one to zero-or-one relationship with another table. None of these needs an identity.
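
    The one to zero-or-one case might look like the sketch below, again assuming SQLite through Python's `sqlite3` module. The `employee` and `employee_parking` tables are hypothetical; the child table reuses the parent's key as its own primary key instead of getting an identity of its own.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
    CREATE TABLE employee (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    -- Zero or one row per employee: the parent's key is the whole key here.
    CREATE TABLE employee_parking (
        employee_id INTEGER PRIMARY KEY REFERENCES employee (id),
        bay         TEXT NOT NULL
    );
    INSERT INTO employee VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO employee_parking VALUES (1, 'B12');
""")


def is_rejected(sql):
    """Return True if the statement violates a constraint."""
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False


# A second bay for the same employee violates the primary key.
second_bay = is_rejected("INSERT INTO employee_parking VALUES (1, 'C03')")
# A bay for a non-existent employee violates the foreign key.
orphan_bay = is_rejected("INSERT INTO employee_parking VALUES (99, 'C04')")

rows = conn.execute("""
    SELECT e.name, p.bay
    FROM   employee e
    LEFT JOIN employee_parking p ON p.employee_id = e.id
    ORDER  BY e.id
""").fetchall()
print(second_bay, orphan_bay, rows)
```

    The shared key enforces "at most one" through the primary key and "belongs to an existing parent" through the foreign key, which is everything a separate identity column plus a unique foreign key would have provided.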

  • 2020-12-07 16:30

    Every table should have some set of field(s) that uniquely identifies each row. Whether there is a numeric identifier field separate from the data fields depends on the domain you are modelling. Not all data falls easily into the 'single numeric id' paradigm, and forcing it there would be inappropriate. That said, a lot of data does fit this paradigm easily and calls for such an identifier. No single rule of 'always do X' holds in any programming environment, and this is another example.
