I'm trying to figure out the "best practices" for deciding whether or not to add an auto-incrementing integer as the primary key to a table.
With regard to using an ISBN or SSN, think about how many rows in other tables are going to reference these through foreign keys: those IDs take up more space than an integer, which can waste disk space and possibly lead to worse join performance.
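To put a rough number on the space argument, here is a back-of-the-envelope sketch. The sizes are assumptions (an ISBN-13 stored as 13 single-byte characters, a 4-byte integer key), the function name `fk_overhead_bytes` is made up for illustration, and the estimate ignores index and per-row overhead, which vary by database.

```python
# Back-of-the-envelope estimate; both sizes are assumptions, not measurements.
ISBN13_BYTES = 13  # ISBN-13 stored as 13 single-byte characters
INT_BYTES = 4      # a typical 32-bit integer key


def fk_overhead_bytes(referencing_rows: int) -> int:
    """Extra bytes spent on foreign-key values when they hold ISBNs instead of ints."""
    return referencing_rows * (ISBN13_BYTES - INT_BYTES)


extra = fk_overhead_bytes(10_000_000)
print(f"{extra / 1_000_000:.0f} MB extra")  # 90 MB extra
```

Whether that matters depends on how many tables reference the key and how many indexes include it.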
There are a lot of already-answered questions on Stack Overflow that can help you with this one.
The term you should be looking for: surrogate keys.
Hope it helps.
Old topic I know, but one other thing to consider: many RDBMSes cluster rows on disk by the PK (InnoDB always does, and SQL Server does by default), so with an auto-incrementing PK every insert lands on the same last block, which can massively increase contention under concurrent writes. This may not be an issue for your baby database you're mucking around with, but believe me it can cause massive performance issues at the bigger end of town.
If you must use an auto-incrementing ID, maybe consider using it as part of a composite PK. Tack it on the end to maintain uniqueness.
Also, it is best to exhaust all possibilities for natural PKs before jumping to a surrogate. People are generally lazy with this.
You've got the idea right there.
Auto-increment should be used as the unique key when the items you are modelling have no unique key of their own. So for elements you could use the atomic number, or for books the ISBN.
But if people are posting messages on a message board, those messages need a unique ID but don't contain one naturally, so we assign the next number from a sequence.
It makes sense to use natural keys where possible; just remember to declare the field as the primary key and ensure that it is indexed for performance.
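A minimal sketch of both cases, using Python's built-in `sqlite3` module with an in-memory database. The table and column names are assumptions chosen for illustration, and the syntax is SQLite's; other databases spell auto-increment differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Natural key: the atomic number already identifies each element.
    CREATE TABLE elements (
        atomic_number INTEGER PRIMARY KEY,
        symbol        TEXT NOT NULL UNIQUE,
        name          TEXT NOT NULL
    );
    -- No natural key: let the database hand out the next number.
    CREATE TABLE messages (
        id   INTEGER PRIMARY KEY AUTOINCREMENT,
        body TEXT NOT NULL
    );
""")

conn.execute("INSERT INTO elements VALUES (1, 'H', 'Hydrogen')")
conn.execute("INSERT INTO elements VALUES (8, 'O', 'Oxygen')")

first = conn.execute("INSERT INTO messages (body) VALUES ('First post')").lastrowid
second = conn.execute("INSERT INTO messages (body) VALUES ('A reply')").lastrowid
print(first, second)  # 1 2


def insert_duplicate_element_fails() -> bool:
    """Return True if the primary key rejects a second row for atomic number 1."""
    try:
        conn.execute("INSERT INTO elements VALUES (1, 'X', 'Not hydrogen')")
    except sqlite3.IntegrityError:
        return True
    return False
```

In this sketch the primary key declaration also gives the natural key its index, so no separate `CREATE INDEX` is needed.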
I'm trying to figure out the "best practices" for deciding whether or not to add an auto-incrementing integer as the primary key to a table.
Use it as a unique identifier for a dataset where the primary key is not part of user-managed data.
Let's say I have a table containing data about the chemical elements. The atomic number of each element is unique and will never change. So rather than using an auto-incrementing integer for each row, it would probably make more sense to just use the atomic number, correct?
Yes.
Would the same be true if I had a table of books? Should I use the ISBN or an auto-incrementing integer for the primary key? Or a table of employees containing each person's SSN?
ISBNs and SSNs are assigned by third parties, and because of their larger storage size they would be an inefficient way to uniquely identify a row. Remember, primary keys are what you use when you join tables. Why use a large data format like an ISBN, which is a string of 10 or 13 characters, as the unique identifier when a small and compact format like an integer is available?
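One way to get the compact join key without losing the ISBN's uniqueness is to keep both: an integer primary key for joins and a `UNIQUE` constraint on the ISBN. This is a sketch using Python's `sqlite3` module; the `books` and `reviews` tables, their columns, and the helper functions are assumptions for illustration, and the inserted rows are sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default
conn.executescript("""
    CREATE TABLE books (
        id    INTEGER PRIMARY KEY,   -- small surrogate key used for joins
        isbn  TEXT NOT NULL UNIQUE,  -- natural identifier, still enforced
        title TEXT NOT NULL
    );
    CREATE TABLE reviews (
        id      INTEGER PRIMARY KEY,
        book_id INTEGER NOT NULL REFERENCES books(id),
        body    TEXT NOT NULL
    );
""")

book_id = conn.execute(
    "INSERT INTO books (isbn, title) VALUES (?, ?)",
    ("9780131103627", "The C Programming Language"),
).lastrowid
conn.execute(
    "INSERT INTO reviews (book_id, body) VALUES (?, ?)",
    (book_id, "Dense but rewarding"),
)


def reviews_for_isbn(isbn: str) -> list[str]:
    """Look a book up by ISBN, but join on the integer key."""
    rows = conn.execute(
        "SELECT r.body FROM reviews r JOIN books b ON b.id = r.book_id "
        "WHERE b.isbn = ?",
        (isbn,),
    ).fetchall()
    return [body for (body,) in rows]


def insert_duplicate_isbn_fails() -> bool:
    """Return True if the UNIQUE constraint rejects a repeated ISBN."""
    try:
        conn.execute(
            "INSERT INTO books (isbn, title) VALUES (?, ?)",
            ("9780131103627", "Duplicate entry"),
        )
    except sqlite3.IntegrityError:
        return True
    return False


print(reviews_for_isbn("9780131103627"))  # ['Dense but rewarding']
```

Only `books` stores the ISBN text; every referencing table carries the integer instead.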
The main problem that I have seen with the auto-incrementing integer approach comes up when you export your data to bring it into another database instance, or even during an archive-and-restore operation. Because the integer has no relation to the data that it identifies, there is no way to determine whether you have duplicates when restoring or adding data to an existing database. If you want no relationship between the data contained in the row and the PK, I would just use a GUID. It's not very user-friendly to look at, but it solves the above problem.
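A small sketch of the GUID approach, assuming Python's `uuid` and `sqlite3` modules and a made-up `notes` table. The helpers `make_db`, `add_note`, and `merge` are hypothetical names, and `INSERT OR IGNORE` is SQLite syntax; other databases have their own upsert forms.

```python
import sqlite3
import uuid


def make_db() -> sqlite3.Connection:
    """Create an independent in-memory database with a GUID-keyed table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE notes (id TEXT PRIMARY KEY, body TEXT NOT NULL)")
    return conn


def add_note(conn: sqlite3.Connection, body: str) -> str:
    """Insert a note under a freshly generated random GUID and return that GUID."""
    note_id = str(uuid.uuid4())
    conn.execute("INSERT INTO notes (id, body) VALUES (?, ?)", (note_id, body))
    return note_id


def merge(dest: sqlite3.Connection, src: sqlite3.Connection) -> None:
    """Copy rows from src into dest, skipping any GUID dest already has."""
    rows = src.execute("SELECT id, body FROM notes").fetchall()
    dest.executemany("INSERT OR IGNORE INTO notes (id, body) VALUES (?, ?)", rows)


site_a, site_b = make_db(), make_db()
add_note(site_a, "written at site A")
add_note(site_b, "written at site B")

merge(site_a, site_b)
merge(site_a, site_b)  # importing the same export again adds nothing

count = site_a.execute("SELECT COUNT(*) FROM notes").fetchone()[0]
print(count)  # 2
```

With auto-incrementing integers, both sites would have issued `id = 1` for unrelated rows, so the same merge would either collide or need the keys renumbered.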