How to calculate the likely size of an OLAP cube

Submitted by 冷暖自知 on 2019-12-10 17:09:26

Question


Does anyone know a method for getting a rough size of an OLAP cube built on a star schema data warehouse? Something based on the number of dimensions, the number of records in the dimension tables, the number of fact records, and the number of aggregations or distinct records, etc.

The database I am looking at has a fact table of over 20 billion rows and a few dimension tables of 20 million, 70 million and 1.3 billion rows.

Thanks, Nicholas


Answer 1:


I can see some roadblocks to creating this estimate. Knowing the row counts and cardinalities of the dimension tables in isolation isn't nearly as important as the relationships between them.

Imagine two low-cardinality dimensions with n and m unique values respectively. Caching OLAP aggregates over those dimensions produces anywhere from n + m values to n * m values, depending on how closely the relationship between them resembles a pure bijection. Given only the information you provided, all you can say is that you'll end up with fewer than 3.64 * 10^34 values (the product of the four row counts you quoted), which is not very useful.
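To make that concrete, here is a toy sketch on made-up data. The helper `groupby_size` and the two tiny fact tables are purely illustrative and not part of any OLAP product. It shows that the same two cardinalities can give very different numbers of cached cells depending on how the dimensions relate, and it reproduces the upper bound from the row counts in the question.

```python
from itertools import product

def groupby_size(fact_rows, dims):
    """Count the distinct key combinations a GROUP BY over `dims` would yield."""
    return len({tuple(row[d] for d in dims) for row in fact_rows})

n, m = 4, 3  # illustrative cardinalities of dimensions "a" and "b"

# Case A: "b" is fully determined by "a" (a strong dependency).
correlated = [{"a": a, "b": a % m} for a in range(n)]

# Case B: every (a, b) pair occurs in the fact table (no dependency at all).
independent = [{"a": a, "b": b} for a, b in product(range(n), range(m))]

for name, rows in [("correlated", correlated), ("independent", independent)]:
    print(name,
          groupby_size(rows, ["a"]),        # n in both cases
          groupby_size(rows, ["b"]),        # m in both cases
          groupby_size(rows, ["a", "b"]))   # 4 when correlated, 12 when independent

# Upper bound from the question's row counts alone:
# fact table times the three dimension tables.
upper_bound = 20_000_000_000 * 20_000_000 * 70_000_000 * 1_300_000_000
print(f"{upper_bound:.2e}")
```

Both fact tables have identical per-dimension cardinalities, so row counts alone cannot tell the two cases apart.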

I'm pessimistic that there's an estimation algorithm fast enough to beat simply generating the cube and weighing it when you're done.




Answer 2:


We wrote a research paper that seems relevant:

Kamel Aouiche and Daniel Lemire, A Comparison of Five Probabilistic View-Size Estimation Techniques in OLAP, DOLAP 2007, pp. 17-24, 2007. http://arxiv.org/abs/cs.DB/0703058
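For a flavour of what hashing-based view-size estimation looks like, here is a minimal sketch of a k-minimum-values (KMV) distinct-count estimator. This is a generic textbook technique chosen for brevity. It is not claimed to be one of the five techniques the paper compares, and the names `_hash01` and `kmv_distinct_estimate` are hypothetical. The idea is to scan the fact rows once, hash each combination of dimension keys into [0, 1), keep only the k smallest hashes, and infer the number of distinct combinations (the view size) from how small the k-th hash is.

```python
import hashlib
import heapq

def _hash01(key):
    """Map a key to a pseudo-uniform float in [0, 1) via SHA-1."""
    digest = hashlib.sha1(repr(key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") / 2**64

def kmv_distinct_estimate(keys, k=256):
    """Estimate the number of distinct keys using O(k) memory."""
    heap = []        # max-heap (values negated) holding the k smallest hashes
    members = set()  # the same hashes, used to skip duplicates
    for key in keys:
        h = _hash01(key)
        if h in members:
            continue
        if len(heap) < k:
            heapq.heappush(heap, -h)
            members.add(h)
        elif h < -heap[0]:
            # New hash is smaller than the current k-th smallest: swap it in.
            evicted = -heapq.heappushpop(heap, -h)
            members.discard(evicted)
            members.add(h)
    if len(heap) < k:
        return float(len(heap))  # fewer than k distinct hashes seen: exact
    return (k - 1) / -heap[0]    # standard KMV estimate from the k-th smallest

# Made-up "fact table": the view groups by two dimension keys.
# By construction there are exactly 35000 distinct (a, b) combinations.
rows = ((i % 5000, i % 7) for i in range(100_000))
estimate = kmv_distinct_estimate(rows, k=256)
print(round(estimate))
```

With k = 256 the typical relative error is on the order of a few percent. A larger k trades memory for accuracy.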




Answer 3:


Well, as a general rule you can assume that Analysis Services data is about 1/4 to 1/3 the size of the same data stored in a relational database.

Edward.

https://social.msdn.microsoft.com/Forums/sqlserver/en-US/6b16d2b2-2913-4714-a21d-07ff91688d11/cube-size-estimation-formula
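Applied to the fact table in the question, the rule of thumb works out as in the back-of-the-envelope sketch below. The row width of 100 bytes is an assumption made only for illustration, since the question does not give one, and the dimension tables are ignored. `cube_size_range` is a hypothetical helper name.

```python
def cube_size_range(relational_bytes):
    """Apply the 1/4 to 1/3 rule of thumb to a relational size in bytes."""
    return relational_bytes / 4, relational_bytes / 3

FACT_ROWS = 20_000_000_000    # from the question
ASSUMED_BYTES_PER_ROW = 100   # assumption: not stated in the question

low, high = cube_size_range(FACT_ROWS * ASSUMED_BYTES_PER_ROW)
print(f"{low / 1e12:.2f} TB to {high / 1e12:.2f} TB")  # 0.50 TB to 0.67 TB
```

Substitute the measured size of the relational tables for the assumed figure to get a usable range.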



Source: https://stackoverflow.com/questions/6413773/how-to-calculate-the-likely-size-of-an-olap-cube
