Is ID column required in SQL?

Submitted by 江枫思渺然 on 2019-11-30 08:25:53

If you really do have a pre-existing column in your data set that already uniquely identifies each row, then no, there's no need for an extra ID column. The primary key, however, must be unique (in ALL circumstances) and cannot be empty (it must be NOT NULL).

In my 20+ years of experience in database design, however, this is almost never truly the case. Most "natural" IDs that appear to be unique ultimately aren't. US Social Security Numbers aren't guaranteed to be unique, and most other "natural" keys end up being almost unique, and that's just not good enough for a database system.

So if you really do have a proper, unique key in your data already, use it! But most of the time, it's easier and more convenient to have a single surrogate ID that you can guarantee will be unique across all rows.

gbn

Don't confuse the logical model with the implementation.

The logical model shows a candidate key (all columns) which could make your primary key.

Great. However...

In practice, a multi-column primary key has downsides: it's wide, it's a poor fit for a clustered index, etc. There is plenty of information out there and in the "related" questions list.

So, you'd typically:

  • add a surrogate key (ID column)
  • add a unique constraint to keep the other columns unique
  • make the ID column the clustered key (there can be only one per table)
  • make either key the primary key
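The steps above can be sketched as follows. This is a minimal illustration, not the answerer's own code: it uses Python's built-in sqlite3 module so it can be run as-is, and the table and column names (person, first_name, last_name, city) are assumptions. SQLite has no explicit clustered-index option like SQL Server; an INTEGER PRIMARY KEY column simply aliases the row ID, which plays a comparable role here.

```python
import sqlite3


def build_person_table():
    """Create an in-memory table with a surrogate key and a unique constraint."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE person (
            id         INTEGER PRIMARY KEY,        -- surrogate key (ID column)
            first_name TEXT NOT NULL,
            last_name  TEXT NOT NULL,
            city       TEXT NOT NULL,
            UNIQUE (first_name, last_name, city)   -- keeps the natural key unique
        )
        """
    )
    return conn


def try_insert(conn, first, last, city):
    """Insert a row; return its generated id, or None if the unique constraint rejects it."""
    try:
        cur = conn.execute(
            "INSERT INTO person (first_name, last_name, city) VALUES (?, ?, ?)",
            (first, last, city),
        )
        return cur.lastrowid
    except sqlite3.IntegrityError:
        return None


conn = build_person_table()
first_id = try_insert(conn, "John", "Smith", "Boston")      # accepted
duplicate_id = try_insert(conn, "John", "Smith", "Boston")  # rejected by UNIQUE
other_id = try_insert(conn, "John", "Smith", "Denver")      # accepted, different city
```

The surrogate id gives other tables a narrow column to reference, while the unique constraint still stops duplicate natural keys from creeping in.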

The main exception is link or many-to-many tables that link 2 ID columns: a surrogate isn't needed there (unless you have a braindead ORM).
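As a rough sketch of that exception, a link table can use the pair of referenced IDs as its primary key. The example below again uses Python's sqlite3 module, and the student, course and enrollment tables are hypothetical names chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces foreign keys when asked
conn.executescript(
    """
    CREATE TABLE student (id INTEGER PRIMARY KEY, name  TEXT NOT NULL);
    CREATE TABLE course  (id INTEGER PRIMARY KEY, title TEXT NOT NULL);

    -- Link table: the two ID columns together form the primary key,
    -- so no extra surrogate column is needed.
    CREATE TABLE enrollment (
        student_id INTEGER NOT NULL REFERENCES student(id),
        course_id  INTEGER NOT NULL REFERENCES course(id),
        PRIMARY KEY (student_id, course_id)
    );

    INSERT INTO student (id, name)  VALUES (1, 'Ann');
    INSERT INTO course  (id, title) VALUES (10, 'SQL');
    INSERT INTO enrollment (student_id, course_id) VALUES (1, 10);
    """
)


def enroll(conn, student_id, course_id):
    """Return True if the link row was inserted, False if a constraint rejected it."""
    try:
        conn.execute(
            "INSERT INTO enrollment (student_id, course_id) VALUES (?, ?)",
            (student_id, course_id),
        )
        return True
    except sqlite3.IntegrityError:
        return False


duplicate_ok = enroll(conn, 1, 10)  # same pair again: rejected by the composite key
unknown_ok = enroll(conn, 1, 99)    # course 99 does not exist: rejected by the foreign key
link_count = conn.execute("SELECT COUNT(*) FROM enrollment").fetchone()[0]
```

The composite key already guarantees that each student/course pair appears at most once, which is all a surrogate ID would have to be backed up with anyway.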

Edit, a link: "What should I choose for my primary key?"

Edit 2, for many-to-many tables: "SQL: Do you need an auto-incremental primary key for Many-Many tables?"

You should have one column in every table that is unique.

EDITED...

This is one of the fundamentals of database table design. It's the row identifier: it identifies which row(s) are being acted upon (updated, deleted, etc.). Relying on column combinations that are "unique", e.g. (first_name, last_name, city), as your key can quickly lead to problems when two John Smiths exist, or worse, when John Smith moves city and you get a collision.

In most cases, it's best to use an artificial key that's guaranteed to be unique, like an auto-increment integer. That's why they are so popular: they're needed. Commonly, the key column is simply called id, or sometimes <tablename>_id. (I prefer id.)

If natural data is available that is unique and present for every row (perhaps retinal scan data for people), you can use that, but all too often, such data isn't available for every row.

Ideally, you should have only one unique column. That is, there should only be one key.

Using IDs to key tables means you can change the content as needed without having to repoint things.

For example, if every row points to a unique user, what would happen if that user changed their name to, let's say, John Blblblbe, a name that was already in the database? And what would happen when your software wants to pick up John Blblblbe's details: whose details would it pick up, the old John's or those of the one who has changed their name? If the answer to both questions is 'nothing special is going to happen', then yes, you don't really need an "ID" column :]
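The renaming scenario can be tried out with a small sketch. It uses Python's sqlite3 module; the app_user and purchase tables, and the second user's original name, are made up for illustration. Because purchases reference the user's id instead of the name, the rename neither breaks nor confuses any reference.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE app_user (
        id   INTEGER PRIMARY KEY,   -- surrogate key
        name TEXT NOT NULL          -- free to change, free to collide
    );
    CREATE TABLE purchase (
        id      INTEGER PRIMARY KEY,
        user_id INTEGER NOT NULL REFERENCES app_user(id),
        item    TEXT NOT NULL
    );
    INSERT INTO app_user (id, name) VALUES (1, 'John Blblblbe'), (2, 'Jim Jones');
    INSERT INTO purchase (user_id, item) VALUES (1, 'book'), (2, 'lamp');
    """
)

# User 2 changes name to one that already exists in the table.
conn.execute("UPDATE app_user SET name = 'John Blblblbe' WHERE id = 2")


def items_for(conn, user_id):
    """Return the items bought by the user with the given id."""
    rows = conn.execute(
        "SELECT item FROM purchase WHERE user_id = ? ORDER BY id", (user_id,)
    ).fetchall()
    return [row[0] for row in rows]


items_user_1 = items_for(conn, 1)
items_user_2 = items_for(conn, 2)
same_name_count = conn.execute(
    "SELECT COUNT(*) FROM app_user WHERE name = 'John Blblblbe'"
).fetchone()[0]
```

After the update, two users share the same name, yet each purchase still resolves to exactly one user. Had purchase referenced the name instead, the lookup would have become ambiguous.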

Important:

Also, a numeric ID column makes looking up an exact row much faster, even when the table has no other indexed keys or has more than one unique column.

Yes, you could have many attributes (values) in a record (row) that you could use to make a record unique. This would be called a composite primary key.

However, it will generally be slower, because building and maintaining the primary index is more expensive. Relational database management systems (RDBMS) use the primary index not only to enforce uniqueness, but in many cases also to order and structure records on disk.

A simple primary key of one incrementing value is generally the most performant and the easiest solution for the RDBMS to manage.

If you are sure that some other column will hold unique data for every row and will never be NULL, then there is no need for a separate ID column to distinguish each row from the others; you can make that existing column the primary key for your table.

No, single-attribute keys are not essential, and neither are surrogate keys. Keys should have as many attributes as are necessary for data integrity: to ensure that uniqueness is maintained, to represent the universe of discourse accurately, and to allow users to identify the data of interest to them. If you have already identified a suitable key and you don't find any real need to create another one, then it makes no sense to add redundant attributes and indexes to your table.

Afz Al

An ID can be more meaningful; for example, an employee ID can encode which department the employee belongs to, the year they joined, and so on. Apart from that, RDBMSs support lots of operations on IDs.
