Architecture for database analytics

三世轮回 提交于 2019-12-02 23:06:34
Ryan Cox

You will want to google Star Schema. The basic idea is to model a special data warehouse / OLAP instance of your existing OLTP system in a way that is optimized to provided the type of aggregations you describe. This instance will be comprised of facts and dimensions.

In the example below, sales 'facts' are modeled to provide analytics based on customer, store, product, time and other 'dimensions'.

You will find Microsoft's Adventure Works sample databases instructive, in that they provide both the OLTP and OLAP schemas along with representative data.

There are special db's for analytics like Greenplum, Aster data, Vertica, Netezza, Infobright and others. You can read about those db's on this site: http://www.dbms2.com/

The canonical handbook on Star-Schema style data warehouses is Raplh Kimball's "The Data Warehouse Toolkit" (there's also the "Clickstream Data Warehousing" in the same series, but this is from 2002 I think, and somewhat dated, I think that if there's a new version of the Kimball book it might serve you better. If you google for "web analytics data warehouse" there are a bunch of sample schema available to download & study.

On the other hand, a lot of the no-sql that happens in real life is based around mining clickstream data, so it might be worth see what the Hadoop/Cassandra/[latest-cool-thing] community has in the way of case studies to see if your use case matches well with what they can do.

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!