What is the resource impact from normalizing a database?

Backend · unresolved · 8 answers · 529 views
Asked by 南方客, 2020-12-10 16:58

When taking a database from a relatively un-normalized form and normalizing it, what, if any, changes in resource utilization might one expect?

For example,

8 Answers
  • 2020-12-10 17:26

    I wanted to elaborate on Henrik Opel's #3 bullet point. Development costs might go up, but they don't have to. In fact, normalizing a database should simplify or enable the use of tools like ORMs, code generators, report writers, etc. These tools can significantly reduce the time spent on the data access layer of your applications and let development move on to adding business value.

    You can find a good StackOverflow discussion here about the development aspect of normalized databases. There were many good answers, comments and things to think about.

  • 2020-12-10 17:31

    To underscore some points made by prior posters: Is your current schema really denormalized? The proper way (IMHO) to design a database is to:

    • Understand as best you can the system/information to be modeled
    • Build a fully normalized model
    • Then, if and as you find it necessary, denormalize in a controlled fashion to enhance performance

    (There may be other reasons to denormalize, but the only ones I can think of off-hand are political ones--have to match the existing code, the developers/managers don't like it, etc.)

    My point is, if you never fully normalized, you don't have a denormalized database; you have an unnormalized one. And I'm sure you can come up with more descriptive, if less polite, terms for those databases.

  • 2020-12-10 17:33

    Normalized schemas tend to perform better for INSERT/UPDATE/DELETE because there are no "update anomalies" and the actual changes that need to be made are more localized.

    SELECTs are a mixed bag. Denormalization is essentially materializing a join. There's no doubt that materializing a join sometimes helps; however, it is often a pessimization (probably more often than not), so don't assume that denormalization will help you. Also, normalized schemas are generally smaller and therefore may require less I/O. A join is not necessarily expensive, so don't automatically assume that it will be.
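Both points can be seen in a minimal sqlite3 sketch (the table and column names are invented for illustration): in the flat design, one logical change fans out over many rows, while the normalized design localizes it and pushes the cost onto the read-side join.

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()

# Denormalized: the customer name is repeated on every order row.
cur.execute("CREATE TABLE orders_flat (order_id INTEGER, cust_id INTEGER, cust_name TEXT)")
cur.executemany("INSERT INTO orders_flat VALUES (?, ?, ?)",
                [(1, 10, "Acme"), (2, 10, "Acme"), (3, 10, "Acme")])

# Renaming the customer (one logical fact) must touch every order row:
cur.execute("UPDATE orders_flat SET cust_name = 'Acme Corp' WHERE cust_id = 10")
rows_touched = cur.rowcount
print(rows_touched)  # 3

# Normalized: the name lives in exactly one row, so the change is localized.
cur.execute("CREATE TABLE customers (cust_id INTEGER PRIMARY KEY, cust_name TEXT)")
cur.execute("CREATE TABLE orders (order_id INTEGER PRIMARY KEY, cust_id INTEGER)")
cur.execute("INSERT INTO customers VALUES (10, 'Acme')")
cur.executemany("INSERT INTO orders VALUES (?, 10)", [(1,), (2,), (3,)])

cur.execute("UPDATE customers SET cust_name = 'Acme Corp' WHERE cust_id = 10")
localized = cur.rowcount
print(localized)  # 1

# The price on the read side: SELECTs now materialize the join at query time.
joined = cur.execute(
    "SELECT o.order_id, c.cust_name FROM orders o JOIN customers c USING (cust_id)"
).fetchall()
print(joined)  # three rows, each carrying 'Acme Corp'
```

Missing an `orders_flat` row in the UPDATE is exactly the "update anomaly" the answer refers to; the normalized schema makes that mistake impossible.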

  • 2020-12-10 17:35

    "Normalization" applies only and exclusively to the logical design of a database.

    The logical design of a database and the physical design of a database are two completely distinct things. Database theory has always intended for things to be this way. The fact that the vast majority of developers overlook or disregard this distinction (out of ignorance, carelessness, laziness, or whatever other so-called-but-invalid "reason") does not make them right.

    A logical design can be said to be normalized or not, but a logical design does not inherently carry any "performance characteristic" whatsoever. Just like 'c:=c+1;' does not inherently carry any performance characteristic.

    A physical design does determine "performance characteristics", but then again a physical design simply does not have the quality of being "normalized or not".

    This flawed perception of "normalization hurting performance" is really nothing else than concrete proof that all the DBMS engines that exist today are just seriously lacking in physical design options.

  • 2020-12-10 17:35

    I've found that normalization, in some cases, will improve performance.

    Small tables read more quickly. A badly denormalized database will often have (a) longer rows and (b) more rows than a normalized design.

    Reading fewer shorter rows means less physical I/O.
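A back-of-the-envelope sketch of that I/O claim (the page size, row widths, and row counts below are all invented for illustration, and per-page overhead is ignored):

```python
PAGE = 8192  # assumed page size in bytes (8 KB, as used by e.g. PostgreSQL)

def pages(n_rows, row_bytes):
    """Pages needed to hold a table, ignoring per-page and per-row overhead."""
    rows_per_page = PAGE // row_bytes
    return -(-n_rows // rows_per_page)  # ceiling division

# Hypothetical flat design: 1M order rows, 400 bytes each (repeated text columns).
flat = pages(1_000_000, 400)

# Normalized: 1M narrow fact rows plus a 10k-row dimension table.
norm = pages(1_000_000, 60) + pages(10_000, 120)

print(flat, norm)  # 50000 7501 -- a full scan reads far fewer pages
```

Real engines add row headers, fill factor, and alignment padding, so the exact counts differ; the direction of the effect is what matters here.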

  • 2020-12-10 17:37

    This cannot really be answered in a general manner, as the impact varies heavily depending on the specifics of the database in question and the apps using it.

    So you basically stated the general expectations concerning the impact:

    1. Overall memory demands for storage should go down, as redundant data gets removed
    2. CPU needs might go up, as queries might get more expensive (Note that in many cases queries on a normalized database will actually be faster, even if they are more complex, as there are more optimization options for the query engine)
    3. Development resource needs might go up, as developers might need to construct more elaborate queries (But on the other hand, you need less development effort to maintain data integrity)
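Points 2 and 3 can be sketched with sqlite3 (schema and names are invented): the normalized read needs a join the flat design would not, but in exchange, data integrity is enforced by the schema rather than by every application that writes to it.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked

con.execute("CREATE TABLE departments (dept_id INTEGER PRIMARY KEY, name TEXT)")
con.execute("""CREATE TABLE employees (
                   emp_id  INTEGER PRIMARY KEY,
                   name    TEXT,
                   dept_id INTEGER NOT NULL REFERENCES departments(dept_id))""")
con.execute("INSERT INTO departments VALUES (1, 'Engineering')")
con.execute("INSERT INTO employees VALUES (100, 'Ada', 1)")

# The read side pays for a join the flat design would not need...
row = con.execute("""SELECT e.name, d.name
                     FROM employees e JOIN departments d USING (dept_id)""").fetchone()
print(row)  # ('Ada', 'Engineering')

# ...but integrity is maintained by the schema, not by application code:
try:
    con.execute("INSERT INTO employees VALUES (101, 'Bob', 99)")  # no such dept
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
print(rejected)  # True
```

With the flat design, nothing stops a writer from inserting a department name that matches no other row, which is the maintenance burden point 3 alludes to.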

    So the only real answer is the usual: it depends ;)

    Note: This assumes that we are talking about cautious and intentional denormalization. If you are referring to the 'just throw some tables together as data comes along' approach, way too common with inexperienced developers, I'd risk the statement that normalization will reduce resource needs on all levels ;)


    Edit: Concerning the specific context added by cdeszaq, I'd say 'Good luck getting your point through' ;)

    Obviously, with over 300 tables and no constraints (!), the answer to your question is definitely 'normalizing will reduce resource needs on all levels' (and probably very substantially), but:

    Refactoring such a mess will be a major undertaking. If there is only one app using this database, it is already dreadful - if there are many, it might become a nightmare!

    So even if normalizing would substantially reduce resource needs in the long run, it might not be worth the trouble, depending on circumstances. The main questions here are about long term scope - how important is this database, how long will it be used, will there be more apps using it in the future, is the current maintenance effort constant or increasing, etc. ...

    Don't ignore that it is a running system - even if it's ugly and horrible, according to your description it is not (yet) broken ;-)
