olap

How do you design an OLAP Database?

不羁岁月 submitted on 2019-12-03 01:38:56
I need a mental process for designing an OLAP database. For a standard relational database it would be (loosely): identify entities; identify relationships; identify properties of entities; and, for each property, ensure that it can be related to only one entity and that it is directly related to that entity. For OLAP databases, I understand the terminology, the motivation and the structure; however, I have no clue how to decompose my relational model into an OLAP model. Identify Dimensions (or By's): these are anything that you may want to analyse or group your report by. Every table in the source
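
As a rough illustration of the "identify dimensions" step in the excerpt above, the following Python sketch splits flat relational rows into dimension lookup tables and a fact table. The sample rows, the column names and the `build_star` helper are all hypothetical. The excerpt is cut off, so this is only one possible decomposition under those assumptions, not the answerer's full method.

```python
# Hypothetical rows from a flat relational "sales" table (illustrative data only).
source_rows = [
    {"customer": "Acme", "product": "Widget", "date": "2019-01-01", "amount": 10.0},
    {"customer": "Acme", "product": "Gadget", "date": "2019-01-01", "amount": 5.0},
    {"customer": "Bolt", "product": "Widget", "date": "2019-01-02", "amount": 7.5},
]

# The "by's" (anything a report may be grouped by) become dimensions.
dimension_columns = ["customer", "product", "date"]
# The numeric values being analysed become measures.
measure_columns = ["amount"]


def build_star(rows, dims, measures):
    """Split flat rows into dimension lookup tables and a list of fact rows."""
    dim_tables = {d: {} for d in dims}
    fact_rows = []
    for row in rows:
        fact = {}
        for d in dims:
            # Assign a surrogate key the first time a dimension value is seen.
            key = dim_tables[d].setdefault(row[d], len(dim_tables[d]) + 1)
            fact[d + "_key"] = key
        for m in measures:
            fact[m] = row[m]
        fact_rows.append(fact)
    return dim_tables, fact_rows


dim_tables, fact_rows = build_star(source_rows, dimension_columns, measure_columns)
```

Each fact row then holds only surrogate keys plus measures, which is the shape a star schema expects.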

What should I have in mind when building OLAP solution from scratch?

一曲冷凌霜 submitted on 2019-12-02 19:34:11
I work for a company whose software product runs on an MS SQL database server, and over the years I have developed 20-30 quite advanced reports in PHP that take data directly from the database. This has been very successful, and people are happy with it. But it has some drawbacks: new changes can be quite development-intensive; the user can't experiment much with the data, because it is locked to a hard-coded view; and big reports can be slow. I am considering gradually moving to an OLAP-based approach, which can be queried from Excel or some web-based service. But I would like to do

What is the best approach to get from relational OLTP database to OLAP cube?

谁说我不能喝 submitted on 2019-12-02 18:28:35
I have a fairly standard normalised OLTP database, and I have realised that I need to run some complex queries (averages, standard deviations) across different dimensions in the data. So I have turned to SSAS and the creation of OLAP cubes. However, to create the cubes I believe my data source needs to be in a 'star' or 'snowflake' configuration (which I don't think it is right now). Is the normal procedure to use SSIS to run some sort of ETL process from my primary OLTP DB into another relational DB that is in the proper 'star' configuration with facts and dimensions, and then use this DB
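
The ETL step described in the question can be sketched as follows. This is a minimal illustration that uses Python's `sqlite3` module as a stand-in for SSIS and SQL Server. All table and column names (`customers`, `orders`, `dim_customer`, `fact_sales`) are hypothetical, and the source and target live in one in-memory database for brevity, whereas the question describes two separate databases.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical normalised OLTP tables (names are illustrative).
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, region TEXT);
CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER,
                     order_date TEXT, total REAL);
INSERT INTO customers VALUES (1, 'Acme', 'North'), (2, 'Bolt', 'South'),
                             (3, 'Core', 'North');
INSERT INTO orders VALUES (1, 1, '2019-01-01', 10.0), (2, 2, '2019-01-01', 20.0),
                          (3, 3, '2019-01-02', 30.0), (4, 1, '2019-01-02', 40.0);

-- Star-schema targets: one dimension table and one fact table.
CREATE TABLE dim_customer (customer_key INTEGER PRIMARY KEY, name TEXT, region TEXT);
CREATE TABLE fact_sales (customer_key INTEGER, order_date TEXT, total REAL);

-- Transform and load: copy the OLTP rows into the star layout.
INSERT INTO dim_customer (customer_key, name, region)
    SELECT id, name, region FROM customers;
INSERT INTO fact_sales (customer_key, order_date, total)
    SELECT customer_id, order_date, total FROM orders;
""")

# Once the star exists, an average across a dimension is a join plus GROUP BY.
avg_by_region = dict(conn.execute("""
    SELECT d.region, AVG(f.total)
    FROM fact_sales AS f
    JOIN dim_customer AS d ON d.customer_key = f.customer_key
    GROUP BY d.region
""").fetchall())
```

A cube built on top of `fact_sales` and `dim_customer` would precompute aggregates like `avg_by_region` rather than running the join on every request.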

Benefits of using Staging Database while designing Data Warehouse

流过昼夜 submitted on 2019-12-02 18:23:16
I am in the process of designing a data warehouse architecture. While exploring options for extracting data from production and loading it into the data warehouse, I came across many articles that mainly suggested the following two approaches:
1. Production DB ----> Data Warehouse (Star Schema) ----> OLAP Cube
2. Production DB ----> Staging Database ----> Data Warehouse (Star Schema) ----> OLAP Cube
I am still not sure which is the better approach in terms of performance and of reducing processing load on the production database. Which approach do you find better when designing a data warehouse? Below points are

Can OLAP be done in BigTable?

霸气de小男生 submitted on 2019-12-02 16:56:56
In the past I built web analytics using OLAP cubes running on MySQL. An OLAP cube, the way I used it, is simply a large table (OK, it was stored a bit smarter than that) where each row is basically a measurement or an aggregated set of measurements. Each measurement has a bunch of dimensions (e.g. pagename, useragent, ip) and a bunch of values (e.g. number of pageviews, number of visitors). The queries that you run on a table like this are usually of the form (meta-SQL): SELECT SUM(hits), SUM(bytes) FROM MyCube WHERE date='20090914' and pagename='Homepage' and
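
To make the "cube as one large table" idea concrete, here is a small sketch using Python's `sqlite3` module in place of MySQL. The `MyCube` table, its columns and the two visible filter conditions come from the excerpt. The sample rows are invented, and the query stops at those two conditions because the original is cut off after them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Dimensions: date, pagename, useragent. Values (measures): hits, bytes.
conn.execute(
    "CREATE TABLE MyCube (date TEXT, pagename TEXT, useragent TEXT, "
    "hits INTEGER, bytes INTEGER)"
)
conn.executemany(
    "INSERT INTO MyCube VALUES (?, ?, ?, ?, ?)",
    [
        ("20090914", "Homepage", "Firefox", 10, 1000),  # invented sample rows
        ("20090914", "Homepage", "Safari", 5, 400),
        ("20090914", "About", "Firefox", 3, 300),
        ("20090915", "Homepage", "Firefox", 7, 700),
    ],
)

# Slice on two dimensions and aggregate the measures over the rest.
hits, total_bytes = conn.execute(
    "SELECT SUM(hits), SUM(bytes) FROM MyCube WHERE date = ? AND pagename = ?",
    ("20090914", "Homepage"),
).fetchone()
```

Every dimension left out of the WHERE clause (here `useragent`) is summed over.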

MDX - TopCount plus 'Other' or 'The Rest' by group (over a set of members)

╄→гoц情女王★ submitted on 2019-12-02 08:30:26
Question: I've got a requirement to display the top 5 customers by sales for each customer group, with the sales of the other customers in the group aggregated as 'Others'. Something similar to this question, but counted separately for each customer group. According to MSDN, to perform TopCount over a set of members you have to use the Generate function. This part works OK: with set [Top5CustomerByGroup] AS GENERATE ( [Klient].[Grupa Klientow].[Grupa Klientow].ALLMEMBERS, TOPCOUNT ( [Klient].[Grupa Klientow]

SSAS Aggregation on Distinct ID

回眸只為那壹抹淺笑 submitted on 2019-12-02 07:41:10
I wish to change the default aggregation from SUM to SUM on distinct ID values. This is the current behaviour:
ID   Amount
1    $10
1    $10
2    $20
3    $30
3    $30
Sum Total = $100
By default, I am getting a sum of $100. I wish to do the sum on distinct IDs and get a value of $60. How would I modify the default aggregation behaviour to achieve this result? Design your data as a many-to-many relationship: create one table/view having one record per ID and the amount column from the data shown in your question (the main fact table), and one table/view having one record per record of your data as shown in your
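
The arithmetic the poster is after can be checked with a few lines of plain Python. This is not SSAS configuration and not the many-to-many design from the answer. It only shows the difference between summing every row and summing one amount per distinct ID, assuming (as in the sample data) that every row for a given ID carries the same amount.

```python
# (ID, Amount) rows from the sample data in the question.
rows = [(1, 10), (1, 10), (2, 20), (3, 30), (3, 30)]

# Default behaviour: every row contributes to the total.
naive_total = sum(amount for _, amount in rows)

# Desired behaviour: each distinct ID contributes its amount once.
amount_by_id = {}
for row_id, amount in rows:
    # Assumption: duplicate rows for an ID have identical amounts; keep the first.
    amount_by_id.setdefault(row_id, amount)
distinct_total = sum(amount_by_id.values())
```

Here `distinct_total` is 60, the value the poster wants the cube to report.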

MDX - TopCount plus 'Other' or 'The Rest' by group (over a set of members)

别来无恙 submitted on 2019-12-02 05:10:55
I've got a requirement to display the top 5 customers by sales for each customer group, with the sales of the other customers in the group aggregated as 'Others'. Something similar to this question, but counted separately for each customer group. According to MSDN, to perform TopCount over a set of members you have to use the Generate function. This part works OK: with set [Top5CustomerByGroup] AS GENERATE ( [Klient].[Grupa Klientow].[Grupa Klientow].ALLMEMBERS, TOPCOUNT ( [Klient].[Grupa Klientow].CURRENTMEMBER * [Klient].[Klient].[Klient].MEMBERS , 5 , [Measures].[Przychody ze sprzedazy rzeczywiste wartosc] )
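
The intended result (top N per group plus an 'Others' row) is easier to see outside MDX. The sketch below expresses the same logic in plain Python. The `top_n_with_others` helper and the sample data are hypothetical, and it assumes sales are already totalled to one row per customer. It says nothing about how the remaining MDX should be written.

```python
from collections import defaultdict


def top_n_with_others(sales, n):
    """Return, per group, the n largest customers plus one 'Others' row."""
    by_group = defaultdict(list)
    for group, customer, amount in sales:
        by_group[group].append((customer, amount))

    result = {}
    for group, members in by_group.items():
        # Rank the group's customers by sales, largest first.
        ranked = sorted(members, key=lambda m: m[1], reverse=True)
        top, rest = ranked[:n], ranked[n:]
        rows = list(top)
        if rest:
            # Everything outside the top n collapses into a single row.
            rows.append(("Others", sum(amount for _, amount in rest)))
        result[group] = rows
    return result


# Invented sample data: (customer group, customer, sales).
sales = [
    ("Retail", "A", 50), ("Retail", "B", 40), ("Retail", "C", 10),
    ("Retail", "D", 5), ("Wholesale", "X", 100), ("Wholesale", "Y", 60),
]
report = top_n_with_others(sales, 2)
```

Groups with no more than `n` customers get no 'Others' row in this sketch.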

Intersection in MDX

試著忘記壹切 submitted on 2019-12-01 09:30:49
I recently ran into a problem in our SQL Server 2008 Analysis Services cube. Imagine you have a simple sales data warehouse with orders and products. Each order can be associated with several products, and each product can be contained in several orders. So the data warehouse consists of at least 3 tables: one for the products, one for the orders and one reference table modelling the n:n relationship between the two. The question I want our cube to answer is: how many orders contain both product x and product y? In SQL, this is easy: select orderid from dbo
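
The poster's SQL is cut off, so the query below is only one common way to answer "which orders contain both x and y". It is run through Python's `sqlite3` module and is not necessarily the query the poster wrote. The `order_products` reference table and its contents are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical reference table modelling the n:n order/product relationship.
CREATE TABLE order_products (orderid INTEGER, productid TEXT);
INSERT INTO order_products VALUES
    (1, 'x'), (1, 'y'),
    (2, 'x'),
    (3, 'y'),
    (4, 'x'), (4, 'y'), (4, 'z');
""")

# Orders containing x, intersected with orders containing y.
orders_with_both = [
    row[0]
    for row in conn.execute("""
        SELECT orderid FROM order_products WHERE productid = 'x'
        INTERSECT
        SELECT orderid FROM order_products WHERE productid = 'y'
        ORDER BY orderid
    """)
]
order_count = len(orders_with_both)
```

The count the cube should reproduce is `order_count`, the size of the intersection.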

Data warehousing principles and NoSQL

拟墨画扇 submitted on 2019-12-01 00:44:38
With MongoDB, CouchDB and related technologies we can get faster querying, so is this still valid? “A copy of transaction data, specially restructured for queries and analyses.” (R. Kimball, The Data Warehouse Toolkit, 1996) I mean, do we really need to restructure our data into an OLAP schema to query it for analysis purposes? More specifically, can drill-down, slice and dice, and other analytical reporting be achieved with NoSQL (NOT necessarily with OLAP modelling)? Also, could we overcome the "data subset" querying limitation of OLAP and report on the whole data universe with NoSQL? In