Deciding between an artificial primary key and a natural key for a Products table

粉色の甜心 2020-11-29 02:22

Basically, I will need to combine product data from multiple vendors into a single database (it's more complex than that, of course) which has several tables that will need

10 answers
  • You could always take a hash of the SKU, which would get rid of the alphabetic characters. You'd have to code for possible collisions (which should be very rare), which is an added complication.

    I'd use the hash to populate the primary key and make the initial import easy, but when using it in the DB, always treat it as if it were a random number. That way the primary key will lose its meaning (and have all the advantages of an auto-incremented key), allowing flexibility in the future.
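
A minimal sketch of the hashing idea, assuming SHA-256 truncated to 63 bits so the result fits a signed 64-bit integer column. The function names `sku_to_key` and `assign_keys` are hypothetical, and the linear probing on collision is only one possible strategy, not something the answer prescribes.

```python
import hashlib

KEY_SPACE = 1 << 63  # keep keys inside a signed 64-bit integer column


def sku_to_key(sku: str) -> int:
    """Map a SKU string to a non-negative integer below 2**63."""
    digest = hashlib.sha256(sku.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") >> 1


def assign_keys(skus):
    """Give each distinct SKU an integer key, probing forward on a collision."""
    key_for_sku = {}
    owner_of_key = {}
    for sku in skus:
        if sku in key_for_sku:
            continue  # same SKU seen again, reuse its key
        key = sku_to_key(sku)
        while key in owner_of_key:  # a different SKU already holds this key
            key = (key + 1) % KEY_SPACE
        owner_of_key[key] = sku
        key_for_sku[sku] = key
    return key_for_sku


keys = assign_keys(["AB-100", "AB-101", "AB-100"])
```

Once the import is done, the rest of the schema would reference these integers without ever recomputing them from the SKU, which is what lets the key "lose its meaning".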

  • 2020-11-29 02:54

    Pretty similar to my question a few months ago...

    Should I have a dedicated primary key field?

    I went with an auto-incrementing PK in the end.

  • 2020-11-29 02:58

    The ever-present danger with natural keys is twofold. Either your initial assumptions will be proven wrong, now or in the future, when some change is made outside your control; or at some point you'll need to reference a record where passing a meaningful field is undesirable (e.g. a web application that uses an employee's social security number as the primary key and then has to use URLs like /employee.php?ssn=xxxxxxx).

    From my own experience with "unique" SKUs and vendor data feeds: are you absolutely sure they are sending you a feed with complete, unique, well-formed SKUs?

    I've had to personally deal with all of the following when getting feeds from vendors who have varying levels of IT and clerical competence:

    • Products are missing their SKU entirely ("")
    • Clerks have used placeholder SKUs in their database like 999999999 and 00000000 and never corrected them
    • Those doing the data entry or import have confused various product numbers, mixing up things like UPC and SCC, or even finding ways to mangle them together (I've seen SCC codes with impossible check digits at the end, because someone just copied the UPC and added 01 or 10 without correcting the check digit)
    • For special reasons, or just incompetence, the vendor has entered the same product twice in their database (for example, rev. 1 and rev. 2 of the same motherboard have the same SKU but exist as two records in the vendor's database and data feed because rev. 2 has new features)
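
The problems listed above can be checked for before a feed is imported. The following is a rough sketch, assuming each feed row is a dict with a `sku` field and an optional 12-digit `upc` field; those field names, the `audit_feed` helper, and the placeholder set are illustrative assumptions, not part of any vendor format. Only the UPC-A check digit is validated here.

```python
from collections import Counter

# Placeholder values quoted in the answer; a real list would grow per vendor.
PLACEHOLDER_SKUS = {"999999999", "00000000"}


def upc_check_digit_ok(code: str) -> bool:
    """Return True if a 12-digit UPC-A code has a valid check digit."""
    if len(code) != 12 or not code.isdigit():
        return False
    digits = [int(c) for c in code]
    # Positions 1, 3, ..., 11 are weighted by 3; positions 2, 4, ..., 10 by 1.
    total = 3 * sum(digits[0:11:2]) + sum(digits[1:11:2])
    return (10 - total % 10) % 10 == digits[11]


def audit_feed(rows):
    """Return (row_index, problem) pairs for suspicious feed rows."""
    problems = []
    sku_counts = Counter((row.get("sku") or "").strip() for row in rows)
    for i, row in enumerate(rows):
        sku = (row.get("sku") or "").strip()
        if not sku:
            problems.append((i, "missing SKU"))
        elif sku in PLACEHOLDER_SKUS:
            problems.append((i, "placeholder SKU"))
        elif sku_counts[sku] > 1:
            problems.append((i, "duplicate SKU"))
        upc = row.get("upc")
        if upc is not None and not upc_check_digit_ok(upc):
            problems.append((i, "bad UPC check digit"))
    return problems


sample_feed = [
    {"sku": "", "upc": "036000291452"},
    {"sku": "999999999"},
    {"sku": "MB-1"},
    {"sku": "MB-1"},
    {"sku": "OK-1", "upc": "036000291453"},
]
report = audit_feed(sample_feed)
```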
  • 2020-11-29 02:58

    Since you're dealing with data from multiple vendors outside of your control, I would use a surrogate key. You don't want to have to rearchitect your database design one day when one of them happens to send you a duplicate.
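
As an illustration of that point, here is a small sketch using Python's built-in `sqlite3` module. The table and column names (`products`, `product_id`, `vendor_id`, `vendor_sku`) are assumptions made up for the example. Because the vendor SKU is an ordinary indexed column rather than the key, a duplicate or missing SKU in a feed does not break the import.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE products (
        product_id INTEGER PRIMARY KEY AUTOINCREMENT,  -- surrogate key
        vendor_id  INTEGER NOT NULL,
        vendor_sku TEXT,                               -- kept as data, may repeat or be NULL
        name       TEXT NOT NULL
    )
    """
)
# Non-unique index: lookups by SKU stay fast without forcing uniqueness.
conn.execute("CREATE INDEX idx_products_vendor_sku ON products (vendor_id, vendor_sku)")

feed = [
    (1, "MB-500", "Motherboard rev. 1"),
    (1, "MB-500", "Motherboard rev. 2"),  # same SKU sent twice by the vendor
    (2, None, "Cable"),                   # SKU missing entirely
]
conn.executemany(
    "INSERT INTO products (vendor_id, vendor_sku, name) VALUES (?, ?, ?)", feed
)

ids = [row[0] for row in conn.execute("SELECT product_id FROM products ORDER BY product_id")]
duplicates = conn.execute(
    "SELECT COUNT(*) FROM products WHERE vendor_id = 1 AND vendor_sku = 'MB-500'"
).fetchone()[0]
```

If the data later proves clean, a unique constraint on `(vendor_id, vendor_sku)` could be added without touching any table that references `product_id`.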

  • 2020-11-29 03:00

    I'd advise using an auto-incremented, "meaningless" integer as the primary key. Should someone come up with the idea of reorganizing product IDs, at least your DB won't get into trouble.
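
To make the renumbering scenario concrete, here is a hedged sketch with Python's `sqlite3`. The `order_lines` table and every column name are hypothetical. The referencing row points at the integer key, so changing the SKU is a single-row update.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("PRAGMA foreign_keys = ON")
db.execute(
    "CREATE TABLE products (product_id INTEGER PRIMARY KEY AUTOINCREMENT, sku TEXT)"
)
db.execute(
    """
    CREATE TABLE order_lines (
        line_id    INTEGER PRIMARY KEY,
        product_id INTEGER NOT NULL REFERENCES products (product_id),
        qty        INTEGER NOT NULL
    )
    """
)

pid = db.execute("INSERT INTO products (sku) VALUES (?)", ("OLD-123",)).lastrowid
db.execute("INSERT INTO order_lines (product_id, qty) VALUES (?, ?)", (pid, 2))

# The business renumbers its products: only the products row changes.
db.execute("UPDATE products SET sku = ? WHERE product_id = ?", ("NEW-A-0001", pid))

joined = db.execute(
    "SELECT p.sku, o.qty FROM order_lines AS o JOIN products AS p USING (product_id)"
).fetchone()
```

With the SKU as the primary key, the same change would have to be propagated to every referencing table.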

  • 2020-11-29 03:01

    If every product will have a SKU and the SKU is unique to each product, I don't see why you wouldn't want to use it as the primary key.
