Deciding between an artificial primary key and a natural key for a Products table

粉色の甜心 2020-11-29 02:22

Basically, I will need to combine product data from multiple vendors into a single database (it's more complex than that, of course) which has several tables that will need

10 Answers
  • 2020-11-29 03:04

    In all but the simplest internal situations, I recommend always going for the surrogate key. It gives you options in the future, and protects you from unknowns.

    There's no reason why additional keys, like an SKU, couldn't be made non-null to enforce them, but by removing your reliance on third parties you at least give yourself the option to choose, rather than having it taken from you and enduring a painful rewrite at a later stage.
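A minimal sketch of that arrangement, using SQLite through Python's sqlite3 module instead of SQL Server so that it runs anywhere. The table and column names (products, vendor, sku) are assumptions made up for illustration, as is the choice to make the SKU unique per vendor.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE products (
        product_id INTEGER PRIMARY KEY,   -- surrogate key, assigned by the database
        vendor     TEXT NOT NULL,         -- hypothetical vendor identifier
        sku        TEXT NOT NULL,         -- the vendor's own natural key
        name       TEXT NOT NULL,
        UNIQUE (vendor, sku)              -- still enforced, just not the primary key
    )
""")

# Two vendors may reuse the same SKU; the surrogate key keeps the rows apart.
conn.execute("INSERT INTO products (vendor, sku, name) VALUES ('acme', 'AB-100', 'Widget')")
conn.execute("INSERT INTO products (vendor, sku, name) VALUES ('globex', 'AB-100', 'Gadget')")

# The same vendor repeating a SKU is rejected by the UNIQUE constraint.
try:
    conn.execute("INSERT INTO products (vendor, sku, name) VALUES ('acme', 'AB-100', 'Copy')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM products").fetchone()[0]
```

If a vendor later changes its SKU scheme, only the sku column and its constraint need attention; rows in other tables that reference product_id are unaffected.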

    Whether you go for the auto-incremented integer or determine the next primary key yourself, there will be complications. With the auto-incremented method, you can insert the record easily and let it assign its own key, but you may have trouble identifying exactly what key your record was given (selecting the maximum key afterwards isn't guaranteed to return yours, because another session may have inserted a row in the meantime).
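The sketch below illustrates the pitfall with SQLite via Python's sqlite3 module (an assumption, chosen for runnability; the products table is hypothetical). The second insert stands in for another session writing between our insert and our lookup. In SQL Server, SCOPE_IDENTITY() or an OUTPUT clause plays the role that cursor.lastrowid plays here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE products (product_id INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT NOT NULL)"
)

# Our insert: the cursor remembers the key generated for this statement.
cur = conn.execute("INSERT INTO products (name) VALUES ('Widget')")
our_key = cur.lastrowid

# Simulate someone else inserting before we look the key up.
conn.execute("INSERT INTO products (name) VALUES ('Gadget')")

# MAX() now reports the other row's key, not ours.
max_key = conn.execute("SELECT MAX(product_id) FROM products").fetchone()[0]

# The key captured at insert time still identifies our row.
our_name = conn.execute(
    "SELECT name FROM products WHERE product_id = ?", (our_key,)
).fetchone()[0]
```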

    I tend to go for the self-assigned key because you have more control and, in SQL Server, you can retrieve your key from a central keys table and ensure nobody else gets the same key, all in one statement:

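    -- Holds the key reserved by the UPDATE below.
    DECLARE @Key INT;

    -- Increment the counter and capture the new value in a single statement.
    UPDATE  KeyTable
    WITH    (ROWLOCK)
    SET     @Key = LastKey = LastKey + 1
    WHERE   KeyType = 'Product';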

    The table records the last key used. The SQL above increments that key directly in the table and returns the new key, ensuring its uniqueness.
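For comparison, here is a rough sketch of the same key-table idea in SQLite via Python's sqlite3 module. This is an assumption-laden stand-in, not the T-SQL approach itself: it uses two statements inside an explicit transaction where the original uses a single UPDATE, and the names key_table, key_type, last_key and next_key are hypothetical.

```python
import sqlite3

# isolation_level=None puts the connection in autocommit mode so that
# transactions can be opened and closed explicitly below.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE key_table (key_type TEXT PRIMARY KEY, last_key INTEGER NOT NULL)")
conn.execute("INSERT INTO key_table (key_type, last_key) VALUES ('Product', 0)")


def next_key(conn, key_type):
    """Reserve and return the next key for key_type (hypothetical helper)."""
    # BEGIN IMMEDIATE takes the write lock up front, so no other writer can
    # bump the counter between the UPDATE and the SELECT.
    conn.execute("BEGIN IMMEDIATE")
    try:
        cur = conn.execute(
            "UPDATE key_table SET last_key = last_key + 1 WHERE key_type = ?",
            (key_type,),
        )
        if cur.rowcount != 1:
            raise KeyError(f"no counter row for key type {key_type!r}")
        key = conn.execute(
            "SELECT last_key FROM key_table WHERE key_type = ?", (key_type,)
        ).fetchone()[0]
        conn.execute("COMMIT")
        return key
    except Exception:
        conn.execute("ROLLBACK")
        raise


first = next_key(conn, "Product")
second = next_key(conn, "Product")
```

Each call hands out a fresh value, and a failed call rolls back so the counter is not advanced by accident.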

    Why you should avoid alphanumeric primary keys:

    Three main problems: performance, collation and space.

    Performance - there is a performance cost, although, like Razzie below, I can't quote any numbers: it is less efficient to index alphanumerics than numbers.

    Collation - your developers may create the same key with different collations in different tables (it happens), which means constantly adding COLLATE clauses when joining those tables in queries, and that gets old really quickly.

    Space - a nine-character SKU like David's takes nine bytes, but an integer takes only four (2 for smallint, 1 for tinyint). Even a bigint takes only 8 bytes.
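The arithmetic can be checked with a few lines of Python. The SKU value and the one-million-row figure are made up for illustration, and the calculation ignores per-row and index overhead, so treat it as a rough estimate only.

```python
import struct

sku = "AB1234567"                      # hypothetical nine-character SKU
sku_bytes = len(sku.encode("ascii"))   # one byte per character in a single-byte encoding

# Fixed-width integer sizes, using struct's standard (non-native) sizes.
sizes = {
    "tinyint": struct.calcsize("<B"),   # 1 byte
    "smallint": struct.calcsize("<h"),  # 2 bytes
    "int": struct.calcsize("<i"),       # 4 bytes
    "bigint": struct.calcsize("<q"),    # 8 bytes
}

# Key storage saved over one million rows by using an int instead of the SKU.
rows = 1_000_000
saved_bytes = rows * (sku_bytes - sizes["int"])
```

The saving repeats in every index and every foreign key column that carries the key.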

  • 2020-11-29 03:08

    This is a choice between surrogate and natural primary keys.

    IMHO, always favour surrogate primary keys. Primary keys shouldn't have meaning, because that meaning can change. Even country names can change, and countries can come into existence and disappear, let alone products. With natural keys you can end up having to change primary key values, which is definitely not advised.

    More on surrogate vs natural keys:

    So surrogate keys win, right? Well, let’s review and see if any of the cons of natural keys apply to surrogate keys:

    • Con 1: Primary key size – Surrogate keys generally don't have problems with index size since they're usually a single column of type int. That's about as small as it gets.
    • Con 2: Foreign key size - They don't have foreign key or foreign index size problems either for the same reason as Con 1.
    • Con 3: Aesthetics - Well, it’s an eye-of-the-beholder type of thing, but they certainly don’t involve writing as much code as with compound natural keys.
    • Con 4 & 5: Optionality & Applicability – Surrogate keys have no problems with people or things not wanting to or not being able to provide the data.
    • Con 6: Uniqueness - They are 100% guaranteed to be unique. That’s a relief.
    • Con 7: Privacy - They have no privacy concerns should an unscrupulous person obtain them.
    • Con 8: Accidental Denormalization – You can’t accidentally denormalize non-business data.
    • Con 9: Cascading Updates - Surrogate keys don't change, so no worries about how to cascade them on update.
    • Con 10: Varchar join speed - They're generally ints, so they're generally as fast to join over as you can get.
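The cascading-update point (Con 9) can be sketched as follows, using SQLite via Python's sqlite3 module. Everything here is a hypothetical illustration: the table names, the SKU values and the scenario of a vendor renaming a SKU are assumptions, not taken from the thread.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves foreign keys off by default

# Natural-key design: order lines reference the SKU itself.
conn.execute("CREATE TABLE products_nat (sku TEXT PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("""
    CREATE TABLE order_lines_nat (
        line_id INTEGER PRIMARY KEY,
        sku     TEXT NOT NULL REFERENCES products_nat (sku) ON UPDATE CASCADE
    )
""")

# Surrogate-key design: order lines reference a meaningless integer.
conn.execute("""
    CREATE TABLE products_sur (
        product_id INTEGER PRIMARY KEY,
        sku        TEXT NOT NULL UNIQUE,
        name       TEXT NOT NULL
    )
""")
conn.execute("""
    CREATE TABLE order_lines_sur (
        line_id    INTEGER PRIMARY KEY,
        product_id INTEGER NOT NULL REFERENCES products_sur (product_id)
    )
""")

conn.execute("INSERT INTO products_nat VALUES ('OLD-1', 'Widget')")
conn.execute("INSERT INTO products_sur VALUES (1, 'OLD-1', 'Widget')")
for _ in range(3):
    conn.execute("INSERT INTO order_lines_nat (sku) VALUES ('OLD-1')")
    conn.execute("INSERT INTO order_lines_sur (product_id) VALUES (1)")

# The vendor renames the SKU.
conn.execute("UPDATE products_nat SET sku = 'NEW-1' WHERE sku = 'OLD-1'")
conn.execute("UPDATE products_sur SET sku = 'NEW-1' WHERE product_id = 1")

# Natural key: every referencing row had to be rewritten by the cascade.
rewritten = conn.execute(
    "SELECT COUNT(*) FROM order_lines_nat WHERE sku = 'NEW-1'"
).fetchone()[0]
# Surrogate key: the child rows still point at product_id 1, untouched.
untouched = conn.execute(
    "SELECT COUNT(*) FROM order_lines_sur WHERE product_id = 1"
).fetchone()[0]
```

With the natural key, the rename touches one parent row plus every child row; with the surrogate key, it touches the parent row only.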

    And there's also Surrogate Keys vs Natural Keys for Primary Key?

  • 2020-11-29 03:17

    I'd also go with an auto-increment primary key. An alphanumeric primary key does have a performance impact, though I don't dare name any numbers. If performance is important in your application, that is all the more reason to go with the auto-increment primary key column.

  • 2020-11-29 03:18

    A surrogate key (auto increment INT field) will uniquely identify a row in the table. On the other hand, a Unique Natural key (productName) will prevent duplicate product data from entering the table.

    With a unique natural key field, two or more rows can never have the same data.

    With a surrogate key field, rows can be unique because of the auto-increment INT field, but the data in the rows will not necessarily be unique, because the surrogate key has no relation to the data.

    Let's take the example of a User table: the table's natural key field (userName) will prevent the same user from registering twice, but the auto-increment INT field (userId) will not.
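A small sketch of the User example, using SQLite via Python's sqlite3 module. The two table names and the register helper are hypothetical, introduced only to contrast a surrogate key on its own with a surrogate key backed by a unique constraint on the natural key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Surrogate key only: nothing stops the same user name appearing twice.
conn.execute(
    "CREATE TABLE users_surrogate_only (user_id INTEGER PRIMARY KEY, user_name TEXT NOT NULL)"
)
# Surrogate key plus a unique constraint on the natural key.
conn.execute(
    "CREATE TABLE users_with_unique (user_id INTEGER PRIMARY KEY, user_name TEXT NOT NULL UNIQUE)"
)


def register(table, user_name):
    """Try to insert user_name into table; return True if the row was accepted."""
    # table is interpolated directly because it only ever comes from the
    # two literal table names used below.
    try:
        conn.execute(f"INSERT INTO {table} (user_name) VALUES (?)", (user_name,))
        return True
    except sqlite3.IntegrityError:
        return False


results_surrogate = [register("users_surrogate_only", "alice") for _ in range(2)]
results_unique = [register("users_with_unique", "alice") for _ in range(2)]
```

The first table accepts both registrations and ends up with two "alice" rows under different user_id values; the second rejects the repeat.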
