How to store value objects in a relational database?

Submitted by 我怕爱的太早我们不能终老 on 2020-01-14 03:18:08

Question


I am working with a large project that has many objects that represent simple (non-related) values. Sometimes these values are a single string, sometimes they are two strings, sometimes a string and an int...

Currently we have a 'values' table in our relational database that contains the columns: Id, Category, String1, String2 ..., Int1, Int2 ..., Double1 etc. It's convenient, but a mess.

The values all have the following properties:

  • Every object with the same Category has the same attributes (i.e. the Category acts as its type).
  • No objects are related (the only key is the Id primary key).

How do we navigate out of this mess? As I see it, our options are as follows:

  1. Just keep adding columns as necessary and forget about semantic mapping between table and object. Just pile it on.
  2. Create a new table for every value object. This will add a large number of tables to the database, most of which will have fewer than 6 rows. I'm worried about the noise that all these extra tables add to the database.
  3. Deploy a schema-free database just for these objects (not really a possibility with our deployment scenarios).
  4. Create a table with Id and Category columns plus a BLOB Value column, and serialize the value objects into the Value column. Is this viable?
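To make option 4 concrete, here is a minimal sketch using Python's standard `sqlite3` and `json` modules. The table name `value_objects`, the helper functions, and the sample categories are all hypothetical, and JSON is only one possible serialization format. It is an illustration under those assumptions, not a recommended implementation.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE value_objects ("
    " id INTEGER PRIMARY KEY,"
    " category TEXT NOT NULL,"
    " value BLOB NOT NULL)"
)

def save(category, obj):
    """Serialize a value object (a plain dict here) into the BLOB column."""
    payload = json.dumps(obj, sort_keys=True).encode("utf-8")
    cur = conn.execute(
        "INSERT INTO value_objects (category, value) VALUES (?, ?)",
        (category, payload),
    )
    return cur.lastrowid

def load_category(category):
    """Deserialize every value object of one category, in insertion order."""
    rows = conn.execute(
        "SELECT value FROM value_objects WHERE category = ? ORDER BY id",
        (category,),
    )
    return [json.loads(blob.decode("utf-8")) for (blob,) in rows]

# Hypothetical sample data.
save("currency", {"code": "USD", "decimals": 2})
save("currency", {"code": "JPY", "decimals": 0})
save("country", {"name": "France", "iso": "FR"})

currencies = load_category("currency")

# Sorting or filtering by an attribute happens in application code here,
# because the database only sees opaque bytes in the value column.
by_decimals = sorted(currencies, key=lambda c: c["decimals"])
```

The sketch also hints at the main caveat: the database can select by Id or Category, but anything involving the attributes inside the serialized value is left to the application.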

This post restates our options. Are there any caveats or pitfalls to using serialization? Is there an option I'm not aware of? Advice most welcome.


Answer 1:


I stumbled upon this by navigating from another relevant question. Although it's quite old, I was intrigued enough to answer, because it not only poses a very well stated problem but also gives an opportunity to discuss database denormalization as a whole.

There are many reasons and even more excuses for a database to be denormalized. Performance might be the most important, but difficulty in data classification (such as the issue at hand) is definitely the most common. Moreover, there are many ways a database can be denormalized, and a good deal of them are addressed by the OP.

Fact is, though, that a database should be denormalized as a last resort, after everything else has failed. The reasons for that include:

  • Data become meaningless to humans as well as the RDBMS. It's hard for someone to understand, or even remember, the purpose of a field named Integer1 or a serialized value which can potentially hold anything. And the RDBMS cannot extract values from serialized entities in order to sort results or apply aggregates.

  • Maintaining a volatile schema is hard. There's a reason why a database schema should be constant: other, higher levels depend on it. If the schema changes overnight, applications must change too, to reflect the new state. Even worse, views, stored procedures and other dependent database components become equally difficult to maintain.

  • Constraints cannot be enforced, indexes cannot be created. There's no point in defining a serialized field as a foreign key, or in confining it to a specific set of values. This cancels a great deal of the database's self-protection mechanisms. Less data integrity means more administrative cost. Moreover, an index on such a field would be equally useless, making the table less open to optimization.

  • Metadata will, eventually, have to be stored as data. Imagine a multilingual CMS with a main article table to hold articles. For every supported language there's a corresponding article_{lang} table to hold translations (i.e. article_en, article_fr, article_es etc). In order to record the existing translations of articles, a "relation" table has to be created, with a foreign key to the article table, a language id, the name of the translation table, and a field that should be a FK to the translation table but cannot be defined as one. Then, try to write a query that counts the available translations for each article!
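For contrast with the CMS example, here is a sketch of the same count when all translations live in one table keyed by language, so that the language is data rather than part of a table name. It uses Python's standard `sqlite3` module. The `article_translation` table, its columns, and the sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
conn.executescript("""
CREATE TABLE article (
    id INTEGER PRIMARY KEY,
    slug TEXT NOT NULL UNIQUE
);
-- One table for every language: the language is a column, not a table name.
CREATE TABLE article_translation (
    article_id INTEGER NOT NULL REFERENCES article(id),
    lang TEXT NOT NULL,
    title TEXT NOT NULL,
    PRIMARY KEY (article_id, lang)
);
INSERT INTO article (id, slug) VALUES (1, 'intro'), (2, 'faq'), (3, 'draft');
INSERT INTO article_translation (article_id, lang, title) VALUES
    (1, 'en', 'Introduction'),
    (1, 'fr', 'Introduction'),
    (1, 'es', 'Introduccion'),
    (2, 'en', 'FAQ');
""")

# Counting translations per article is a plain join plus GROUP BY.
# The LEFT JOIN keeps articles that have no translation yet.
counts = conn.execute("""
    SELECT a.slug, COUNT(t.lang) AS translations
    FROM article AS a
    LEFT JOIN article_translation AS t ON t.article_id = a.id
    GROUP BY a.id
    ORDER BY a.id
""").fetchall()
```

With one table per language, the same count would need one branch per article_{lang} table, and the query would have to be rewritten whenever a language is added.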

So avoid denormalization as much as possible. If entities can be classified to some extent, then IS-A relations could be the answer. To support arbitrary attributes, or when classification is just not worthwhile, a key/value pair table with a foreign key to the table holding the normalized data is already more than enough of a sacrifice.
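A key/value pair table of this kind might look like the following sketch, again with Python's standard `sqlite3` module. The table names `value_object` and `value_attribute`, the helper `attributes`, and the sample data are hypothetical. Every value is stored as text here, which is part of the sacrifice: the database no longer knows each attribute's type.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
conn.executescript("""
CREATE TABLE value_object (
    id INTEGER PRIMARY KEY,
    category TEXT NOT NULL
);
CREATE TABLE value_attribute (
    object_id INTEGER NOT NULL REFERENCES value_object(id),
    name TEXT NOT NULL,
    value TEXT NOT NULL,
    PRIMARY KEY (object_id, name)
);
INSERT INTO value_object (id, category) VALUES (1, 'currency'), (2, 'currency');
INSERT INTO value_attribute (object_id, name, value) VALUES
    (1, 'code', 'USD'), (1, 'decimals', '2'),
    (2, 'code', 'JPY'), (2, 'decimals', '0');
""")

def attributes(object_id):
    """Return all attributes of one object as a name -> value dict."""
    rows = conn.execute(
        "SELECT name, value FROM value_attribute WHERE object_id = ?",
        (object_id,),
    )
    return dict(rows)

# The attribute name is ordinary data, so the database can filter on it.
zero_decimal_ids = [
    row[0]
    for row in conn.execute(
        "SELECT object_id FROM value_attribute"
        " WHERE name = 'decimals' AND value = '0'"
    )
]

# The foreign key rejects attributes that point at a missing object.
try:
    conn.execute(
        "INSERT INTO value_attribute (object_id, name, value)"
        " VALUES (99, 'code', 'EUR')"
    )
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
```

Unlike a serialized column, this layout still lets the database enforce the foreign key and the one-value-per-attribute rule, and it can index the `name` and `value` columns.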



Source: https://stackoverflow.com/questions/15650898/how-to-store-value-objects-in-a-relational-database
