MySQL - Should I denormalise?

Problem

Overview (Sorry it's vague - I think if I went into more detail it would just overcomplicate things)

I have three tables: table one contains an id, table two contains its own id and table one's id, and table three contains its own id and table two's id.

I have spent a lot of time pondering and I think it would be more efficient for table three to also contain the related table one's id.

- It will mean I will not have to join three tables; I can just query table three (for a query that will be used very often).

- It will allow me to implement a reservation system more easily, by locking only the rows within table three that contain a specific id from table one.
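To make the trade-off concrete, here is a minimal sketch of the two query shapes. It uses Python's built-in sqlite3 module as a stand-in for MySQL, and the table and column names (table_one, table_two, table_three, table_one_id, table_two_id) are hypothetical, since the real schema is not shown in this post.

```python
import sqlite3

# In-memory SQLite database, used here only as a stand-in for MySQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_one (id INTEGER PRIMARY KEY);
    CREATE TABLE table_two (
        id INTEGER PRIMARY KEY,
        table_one_id INTEGER NOT NULL REFERENCES table_one(id)
    );
    -- Hypothetical denormalised layout: table_three repeats table_one_id.
    CREATE TABLE table_three (
        id INTEGER PRIMARY KEY,
        table_two_id INTEGER NOT NULL REFERENCES table_two(id),
        table_one_id INTEGER NOT NULL REFERENCES table_one(id)
    );
    INSERT INTO table_one (id) VALUES (1), (2);
    INSERT INTO table_two (id, table_one_id) VALUES (10, 1), (11, 1), (20, 2);
    INSERT INTO table_three (id, table_two_id, table_one_id)
    VALUES (100, 10, 1), (101, 11, 1), (200, 20, 2);
""")

# Normalised lookup: reach table one's id through table_two.
# table_one itself is not joined, because table_two already carries its id.
via_join = conn.execute(
    """
    SELECT t3.id
    FROM table_three AS t3
    JOIN table_two AS t2 ON t2.id = t3.table_two_id
    WHERE t2.table_one_id = ?
    ORDER BY t3.id
    """,
    (1,),
).fetchall()

# Denormalised lookup: filter table_three directly on the copied column.
direct = conn.execute(
    "SELECT id FROM table_three WHERE table_one_id = ? ORDER BY id",
    (1,),
).fetchall()
```

Both queries return the same rows only as long as the copied column in table_three agrees with table_two. Note also that when the filter is just table one's id, the normalised version needs two tables rather than three.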

For anyone who wants to know more about the database layout there is more info here

Question

What are the disadvantages of denormalisation? I have seen some people who are completely against it and others who believe that, in the right situation, it is a useful tool. The ids will never change, so I do not really see any disadvantage other than having to insert the same data twice and the additional space it will consume (which, as it is just ids, will surely be negligible).

Answer 1:

My advice is to follow this general rule: Normalise by default, then denormalise if and when you identify a performance problem which it will solve.

I find normalised data, and code dealing with it, easier and more logical to maintain. I don't think there is any problem using denormalisation to improve performance, but I would not speculatively apply any performance optimisation that reduces maintainability until you are sure it is necessary.

Answer 2:

The only time you really want to denormalize is if it's required to get the performance you want.

This has already been asked several times. See here

Answer 3:

As it's a one (Table 1) to many (Table 2), with another one (Table 2) to many (Table 3), I would keep the same structure, as there seem to be three layers there.

e.g.

Table 1
- Table 2
  - Table 3

Also, a lot will depend on what additional fields you are storing within those tables.

Answer 4:

Any rule may be broken if there is a good reason for it.

In your case I wonder what the three tables contain. Does Table three really describe Table two or does it describe table one directly?

The disadvantage of having self-id, table-two-id and table-one-id in table three is that it can lead to inconsistency: what if, by mistake, table two says table-one-id 1 while the corresponding row in table three says table-one-id 15?

It depends on the data and the entity relationships in your data. For me, it would be more important to have no inconsistencies, even if selects take a little longer.
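One way to keep the redundant column without risking that kind of mismatch is a composite foreign key, so the database itself rejects a table-three row whose table-one id disagrees with its table-two parent. The following is only a sketch: it runs on Python's built-in sqlite3 module rather than MySQL, and all table and column names are hypothetical. InnoDB also supports multi-column foreign keys, but the exact DDL should be checked against the MySQL documentation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE table_one (id INTEGER PRIMARY KEY);
    CREATE TABLE table_two (
        id INTEGER PRIMARY KEY,
        table_one_id INTEGER NOT NULL REFERENCES table_one(id),
        UNIQUE (id, table_one_id)  -- target for the composite foreign key
    );
    CREATE TABLE table_three (
        id INTEGER PRIMARY KEY,
        table_two_id INTEGER NOT NULL,
        table_one_id INTEGER NOT NULL,
        -- The (parent, grandparent) pair must exist together in table_two.
        FOREIGN KEY (table_two_id, table_one_id)
            REFERENCES table_two (id, table_one_id)
    );
    INSERT INTO table_one (id) VALUES (1), (15);
    INSERT INTO table_two (id, table_one_id) VALUES (10, 1);
""")

# Consistent row: table_two 10 really belongs to table_one 1.
conn.execute(
    "INSERT INTO table_three (id, table_two_id, table_one_id) VALUES (100, 10, 1)"
)

# Inconsistent row: table_two 10 does not belong to table_one 15.
try:
    conn.execute(
        "INSERT INTO table_three (id, table_two_id, table_one_id) VALUES (101, 10, 15)"
    )
    mismatch_rejected = False
except sqlite3.IntegrityError:
    mismatch_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM table_three").fetchone()[0]
```

The cost is one extra unique index on table_two; in exchange, the "table-one-id 1 versus 15" mistake fails at insert time instead of surfacing later as a wrong query result.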

EDIT: After reading about your tables, I would suggest adding a table-one-id to table three (areas), because table-one-id never changes and is therefore relatively safe from inconsistency.

Answer 5:

Normalization vs efficiency is usually a trade-off. While normalization is generally a good thing, it is not a silver bullet. If you have a clear reason (as it seems you do), denormalization is perfectly acceptable.

Answer 6:

Schemas containing less than fully normalized tables suffer from what is called "harmful redundancy". Harmful redundancy can result in storing the same fact in more than one place, or in not having any place to store a fact that needs to be stored. These problems are known as "insert anomalies", "update anomalies", or "delete anomalies".

To make a long story short, if you store a fact in more than one place, then sooner or later you are going to store mutually contradictory facts in the two places, and your database will begin to give contradictory answers, depending on which version of the facts the query found.
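As an illustration of how such contradictions show up, the sketch below stores the same fact in two places with no constraint protecting the copy, then asks the same question two ways. It uses Python's built-in sqlite3 module instead of MySQL, and the table and column names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_two (
        id INTEGER PRIMARY KEY,
        table_one_id INTEGER NOT NULL
    );
    CREATE TABLE table_three (
        id INTEGER PRIMARY KEY,
        table_two_id INTEGER NOT NULL,
        table_one_id INTEGER NOT NULL  -- redundant copy, unprotected
    );
    INSERT INTO table_two (id, table_one_id) VALUES (10, 1);
    -- Row 101 contradicts table_two, which says parent 10 belongs to 1.
    INSERT INTO table_three (id, table_two_id, table_one_id)
    VALUES (100, 10, 1), (101, 10, 15);
""")

# "Which table_three rows belong to table_one 1?" answered via table_two.
answer_via_join = conn.execute(
    """
    SELECT t3.id
    FROM table_three AS t3
    JOIN table_two AS t2 ON t2.id = t3.table_two_id
    WHERE t2.table_one_id = 1
    ORDER BY t3.id
    """
).fetchall()

# The same question answered from the redundant column.
answer_direct = conn.execute(
    "SELECT id FROM table_three WHERE table_one_id = 1 ORDER BY id"
).fetchall()

# Audit query: list every row where the two copies of the fact disagree.
contradictions = conn.execute(
    """
    SELECT t3.id, t3.table_one_id, t2.table_one_id
    FROM table_three AS t3
    JOIN table_two AS t2 ON t2.id = t3.table_two_id
    WHERE t3.table_one_id <> t2.table_one_id
    """
).fetchall()
```

The two answers differ because each query trusts a different copy of the fact. An audit query like the last one can be run periodically, although preventing the mismatch with a constraint is usually preferable to detecting it afterwards.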

If you are forced to "invent a dummy record" in order to have a place to store a needed fact, then sooner or later you are going to write a query that mistakenly treats the dummy record like a real one.

If you are a super programmer, and you never make mistakes, then you don't have to worry about the above. I have never met such a programmer, although I've met lots of people who think they never make mistakes.

I would refrain from "denormalizing" as a practice. That's like "driving away from Chicago". You still don't know where you are going. However, there are times when normalization rules should be disregarded, as others have noted. If you are designing a star schema (or a snowflake schema) you are going to have to disregard some of the normalization rules in order to get the best star (or snowflake).

Source: https://stackoverflow.com/questions/1729191/mysql-should-i-denormalise

Tags

mysql

database

database-design

denormalization