star schema design - one column dimensions

时光毁灭记忆、已成空白 提交于 2019-12-23 03:19:27

问题


I`m new to data warehousing, but I think my question can be relatively easy answered. I built a star schema, with a dimension table 'product'. This table has a column 'PropertyName' and a column 'PropertyValue'. The dimension therefore looks a little like this:

surrogate_key | natural_key (productID) | PropertyName | PropertyValue | ...
    1              5                          Size           20          ...
    2              5                          Color          red
    3              6                          Size           20
    4              6                          Material       wood

and so on.

In my fact table I always use the surrogate keys of the dimensions. Cause of the PropertyName and PropertyValue columns my natural key isn`t unique / identifying anymore, so I get way too much rows in my fact table.

My question now is, what should I do with the property columns? Would it be best, to put each property into separate dimensions, like dimension size, dimension color and so on? I got about 30 different properties. Or shall I create columns for each property in the fact table? Or make one dimension with all properties?

Thanks in advance for any help.


回答1:


Your dimension table 'product' should look like this:

surrogate_key | natural_key (productID) | Color | Material | Size | ...
    1              5                      red     wood       20     ...
    2              6                      red     ...         

If you have to many properties, try to group them in another dimension. For example Color and Material can be attributes of another dimension if you can have the same product with same id and same price in another color or material. Your fact table can identify product with two keys: product_id and colormaterial_id...

Reading recommendation: The Data Warehouse Toolkit, Ralph Kimball




回答2:


Your design is called EAV (entity-attribute-value) table.

It's a nice design for the sparse matrices (large number of properties with only few of them filled at the same time).

However, it has several drawbacks.

  • It cannot be indexed (and hence efficiently searched) on two or more properties at once. A query like this: "get all products made of wood and having size or 20" will be less efficient.

  • Implementing constraints involving several attributes at once is more complex

  • etc.

If it's not a problem for you, you can use EAV design.



来源:https://stackoverflow.com/questions/3710182/star-schema-design-one-column-dimensions

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!