Denormalizing for sanity or performance?

我们两清 posted on 2019-12-04 03:37:25

I don't know if I would even call what you want to do denormalization. It looks more like you just want to replace artificial foreign keys (StateId, AgencyId) with natural foreign keys (State Abbreviation, Agency Code). Using varchar fields instead of integer fields can slow down join/query performance, but (a) if the natural FK is usually the value you want anyway, you won't need to join the table most of the time, so it's not a big deal, and (b) your database would need to be pretty big or under heavy load for the difference to be noticeable.
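
To make point (a) concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table and column names (state_s, office_s, state_n, office_n) are made up for illustration and are not from the question. It only shows that the natural-key variant answers the same question without a join; it says nothing about speed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Surrogate-key variant: the abbreviation lives only in the lookup table.
    CREATE TABLE state_s (state_id INTEGER PRIMARY KEY, abbr TEXT UNIQUE, name TEXT);
    CREATE TABLE office_s (office_id INTEGER PRIMARY KEY, city TEXT,
                           state_id INTEGER REFERENCES state_s(state_id));

    -- Natural-key variant: the abbreviation itself is the key.
    CREATE TABLE state_n (abbr TEXT PRIMARY KEY, name TEXT);
    CREATE TABLE office_n (office_id INTEGER PRIMARY KEY, city TEXT,
                           state_abbr TEXT REFERENCES state_n(abbr));

    INSERT INTO state_s VALUES (1, 'TX', 'Texas'), (2, 'OH', 'Ohio');
    INSERT INTO office_s VALUES (1, 'Austin', 1), (2, 'Columbus', 2);
    INSERT INTO state_n VALUES ('TX', 'Texas'), ('OH', 'Ohio');
    INSERT INTO office_n VALUES (1, 'Austin', 'TX'), (2, 'Columbus', 'OH');
""")

# Surrogate keys: a join is needed just to see the abbreviation.
with_join = conn.execute("""
    SELECT o.city, s.abbr
    FROM office_s o JOIN state_s s ON s.state_id = o.state_id
    ORDER BY o.office_id
""").fetchall()

# Natural keys: the abbreviation is already in the row.
without_join = conn.execute(
    "SELECT city, state_abbr FROM office_n ORDER BY office_id"
).fetchall()

print(with_join == without_join)
```

The full state name still requires a join in both variants, so this only helps when the code itself is what you usually want.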

But djna is correct in that you need a complete understanding of current and future needs before making a change like this. Are you SURE the three letter agency codes will never change, even five years from now? Really, really sure?

Some denormalization can be needed for performance (and sanity) reasons at times. It's hard to tell without seeing all your tables, needs, etc.

But why not just build a few convenience views (to do a few joins) and then use these to be able to write simpler queries?
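
Something along these lines is what I mean by a convenience view. This is a rough sketch with sqlite3 from the Python standard library; the state, agency, and employee tables and the employee_v view are hypothetical stand-ins for your schema, which I haven't seen.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE state (state_id INTEGER PRIMARY KEY, abbr TEXT NOT NULL);
    CREATE TABLE agency (agency_id INTEGER PRIMARY KEY, code TEXT NOT NULL);
    CREATE TABLE employee (
        employee_id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        state_id INTEGER NOT NULL REFERENCES state(state_id),
        agency_id INTEGER NOT NULL REFERENCES agency(agency_id)
    );

    -- The joins are written once, here, and the base tables stay normalized.
    CREATE VIEW employee_v AS
    SELECT e.employee_id, e.name, s.abbr AS state_abbr, a.code AS agency_code
    FROM employee e
    JOIN state s ON s.state_id = e.state_id
    JOIN agency a ON a.agency_id = e.agency_id;

    INSERT INTO state VALUES (1, 'TX'), (2, 'OH');
    INSERT INTO agency VALUES (10, 'ABC'), (20, 'XYZ');
    INSERT INTO employee VALUES (1, 'Ann', 1, 20), (2, 'Bob', 2, 10);
""")

# Day-to-day queries read from the view and never mention the ids.
rows = conn.execute(
    "SELECT name, state_abbr, agency_code FROM employee_v WHERE state_abbr = ?",
    ("TX",),
).fetchall()
print(rows)
```

The query at the bottom filters on the abbreviation directly, which is the convenience you were after, without changing any keys.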

Beware of wanting to shape things to your current idioms. Right now the unfamiliar code seems unwieldy and obstructive to your understanding. In time it's possible that you will become acclimatised.

If current (or known future) requirements, such as performance, are not being met, then that's a whole different issue. But remember that anything can be performance tuned; the objective is not to make things as fast as possible, but to make them fast enough.

David

This previous post dealt with a similar issue to the one you're having. Hopefully it will be helpful to you.

Dealing with "hypernormalized" data

My own personal take on normalization is to normalize as much as possible and denormalize only for performance. And even denormalization for performance is something to avoid. I'd go the route of profiling, setting correct indexes, etc. before I'd denormalize.
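
As a small illustration of checking indexes first, the sketch below asks SQLite for its query plan before and after adding an index on a join/filter column. It uses Python's sqlite3 module; the employee table and the idx_employee_agency index are invented for the example, and your own database will have its own plan tooling. The exact plan wording differs between SQLite versions, so it is printed rather than quoted here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee ("
    "employee_id INTEGER PRIMARY KEY, name TEXT, agency_id INTEGER)"
)
conn.executemany(
    "INSERT INTO employee (name, agency_id) VALUES (?, ?)",
    [(f"emp{i}", i % 50) for i in range(5000)],
)

QUERY = "SELECT name FROM employee WHERE agency_id = ?"


def plan(connection):
    """Return SQLite's plan for QUERY as one string (detail is the last column)."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + QUERY, (7,)).fetchall()
    return " | ".join(row[-1] for row in rows)


plan_before = plan(conn)  # no usable index yet, so a full table scan
conn.execute("CREATE INDEX idx_employee_agency ON employee (agency_id)")
plan_after = plan(conn)   # the planner can now search via the new index

print(plan_before)
print(plan_after)
```

If the plan already uses the index and the query is still too slow, that is the point at which denormalizing becomes worth discussing.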

Sanity... That's overrated. Especially in our profession.

Well, what about the performance? If the performance is okay, just make the five-table JOIN into a view and, for sanity, SELECT from the view when you need the data.

State abbreviations are one of the cases in which I think meaningful keys are okay. For very simple lookup tables with a limited number of rows and where I'm in complete control of the data (meaning it's not populated from some outside source) I'll sometimes create meaningful four or five character keys so that the key value can proxy for the fully descriptive lookup value in some queries.

Create a view (or an inline table-valued function if you need parameterization). In any case, I usually put all my code into stored procedures (some of it code-generated) whether they use views or not, so you pretty much only ever write the join once.

An argument (for this "normalization") that the three-letter codes might change isn't very compelling without a plan for what you will do if the codes do change, and how your artificial-key scenario will address this eventuality better than using the codes as keys. Unless you've implemented a fully temporal schema (which is horribly difficult to do and not suggested by your example), it's not obvious to me how your normalization benefits you at all. Now if you work with agencies from multiple sources and standards that might have colliding code names, or if "state" might eventually mean a two-letter code for state, province, department, canton, or estado, that's another matter. You then need your own keys or you need a two-column key with more information than that code.
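
For the colliding-codes case, a two-column key could look roughly like this. It's a sketch in Python with sqlite3; the agency table, the source column, and the sample values are all hypothetical, chosen only to show the same code arriving from two different standards.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE agency (
        source TEXT NOT NULL,   -- which standard or feed the code comes from (assumed)
        code   TEXT NOT NULL,   -- the three-letter code as published by that source
        name   TEXT NOT NULL,
        PRIMARY KEY (source, code)
    )
""")

# The same code from two different sources can coexist.
conn.execute("INSERT INTO agency VALUES ('federal', 'DOT', 'Federal example agency')")
conn.execute("INSERT INTO agency VALUES ('state', 'DOT', 'State example agency')")

# A repeated code within one source is still rejected by the key.
try:
    conn.execute("INSERT INTO agency VALUES ('federal', 'DOT', 'Duplicate')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

dot_count = conn.execute(
    "SELECT COUNT(*) FROM agency WHERE code = 'DOT'"
).fetchone()[0]
print(dot_count, duplicate_rejected)
```

Referencing tables would then carry both columns as their foreign key, which is the "more information than that code" part.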
