data-warehouse

How can one build the TFS cube from scratch?

梦想与她 posted on 2019-12-23 05:24:02
Question: We are having issues with the TFS cube. I don't think it has been built since TFS was installed. The warehouse seems to be working and has new data; only the cube seems not to work. We tried rebuilding it using the TFS Administrator Console, but that made things worse: the data that was in there was erased and replaced by what looks like a blank database. I tried deleting the database so that I could see whether the cube was actually being built, but now when I run the rebuild it says

star schema design - one column dimensions

时光毁灭记忆、已成空白 posted on 2019-12-23 03:19:27
Question: I'm new to data warehousing, but I think my question can be answered relatively easily. I built a star schema with a dimension table 'product'. This table has a column 'PropertyName' and a column 'PropertyValue'. The dimension therefore looks a little like this:

surrogate_key | natural_key (productID) | PropertyName | PropertyValue | ...
1             | 5                       | Size         | 20            | ...
2             | 5                       | Color        | red           |
3             | 6                       | Size         | 20            |
4             | 6                       | Material     | wood          |

and so on. In my fact table I always use the surrogate keys of the dimensions. Because of the

ETL Operation - Return Primary Key

你说的曾经没有我的故事 posted on 2019-12-23 03:17:24
Question: I am using Talend to populate a data warehouse. My job writes customer data to a dimension table and transaction data to the fact table. The surrogate key (p_key) on the fact table is auto-incrementing. When I insert a new customer, I need my fact table to reflect the id of the related customer. As I mentioned, my p_key is auto-incrementing, so I can't just insert an arbitrary value for the p_key. Any thoughts on how I can insert a row into my dimension table and still retrieve the
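The pattern the question is after (insert the dimension row, read back the generated key, then use it in the fact row) can be sketched outside Talend. This is a minimal illustration using Python's built-in `sqlite3` as a stand-in database, not a Talend job; the table names `dim_customer` and `fact_transaction` and the function `insert_customer_and_fact` are hypothetical, and the sketch assumes the auto-incrementing `p_key` lives on the dimension table.

```python
import sqlite3


def insert_customer_and_fact(conn, customer_name, amount):
    """Insert a dimension row, then reuse its generated key in the fact row."""
    cur = conn.cursor()
    cur.execute("INSERT INTO dim_customer (name) VALUES (?)", (customer_name,))
    # lastrowid holds the key the database generated for the insert above.
    customer_key = cur.lastrowid
    cur.execute(
        "INSERT INTO fact_transaction (customer_key, amount) VALUES (?, ?)",
        (customer_key, amount),
    )
    conn.commit()
    return customer_key


# Hypothetical schema, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_customer (
    p_key INTEGER PRIMARY KEY AUTOINCREMENT,
    name  TEXT
);
CREATE TABLE fact_transaction (
    id           INTEGER PRIMARY KEY AUTOINCREMENT,
    customer_key INTEGER REFERENCES dim_customer (p_key),
    amount       REAL
);
""")

key = insert_customer_and_fact(conn, "Alice", 19.99)
```

Other databases expose the generated key differently (for example through a `RETURNING` clause or a driver-level "generated keys" call), so the exact mechanism depends on the target system.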

How to connect a fact and dimension table that are in 1-N relationship

只谈情不闲聊 posted on 2019-12-23 03:03:15
Question: I have a Purchase FactTable with some measures and dimension keys. Then there's another table: the Discount Table. The Purchase FactTable is in a 1-N relationship with the Discount Table (for each purchase I might have bought several discounted items). The Discount Table has some attributes (description, note) and some numeric values (for example, discount in $) that I would like to roll up. If I create a dimension out of this Discount Table, I'll get a wrong number of purchase counts in a sum count
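The double-counting effect described in the question can be reproduced on a tiny data set. The sketch below uses Python's built-in `sqlite3`; the table and column names (`fact_purchase`, `discount`, `discount_usd`) are hypothetical. It only illustrates why a plain count over the 1-N join is inflated and how counting distinct purchase keys avoids that; it does not show how a cube should be modeled.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE fact_purchase (purchase_id INTEGER PRIMARY KEY, amount REAL);
CREATE TABLE discount (
    discount_id  INTEGER PRIMARY KEY,
    purchase_id  INTEGER REFERENCES fact_purchase (purchase_id),
    discount_usd REAL
);
-- Purchase 1 has two discounted items, purchase 2 has one.
INSERT INTO fact_purchase VALUES (1, 100.0), (2, 50.0);
INSERT INTO discount VALUES (10, 1, 5.0), (11, 1, 3.0), (12, 2, 1.0);
""")

# Joining 1-N repeats each purchase once per discount row,
# so a plain COUNT counts join rows, not purchases.
naive_count = conn.execute("""
    SELECT COUNT(p.purchase_id)
    FROM fact_purchase p
    JOIN discount d ON d.purchase_id = p.purchase_id
""").fetchone()[0]

# Counting distinct purchase keys restores the purchase grain,
# while the discount amounts can still be summed at their own grain.
distinct_count, total_discount = conn.execute("""
    SELECT COUNT(DISTINCT p.purchase_id), SUM(d.discount_usd)
    FROM fact_purchase p
    JOIN discount d ON d.purchase_id = p.purchase_id
""").fetchone()
```

Here `naive_count` is 3 although there are only 2 purchases; `distinct_count` is 2 and `total_discount` is 9.0.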

Database design for incremental “export” to data warehouse

谁说我不能喝 posted on 2019-12-23 02:54:07
Question: Given: a 1 TB relational database, currently in SQL Server. The data warehouse needs a "copy" of major parts of the database. The warehouse data should not be more than 24 hours old. The size of the relational database makes it impractical to do a full load every night. How should I design my relational database to support incremental loads to the warehouse? A very small portion (<0.1%) of the database changes in a single day, and it is mostly inserts. The intra-day changes are not required,
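One common way to approach this kind of requirement is a "high-water mark" column: each source row carries a last-modified timestamp, and every load copies only rows newer than the previous run's watermark. The sketch below is one possible illustration, not the answer to the question; it uses Python's built-in `sqlite3` instead of SQL Server, and the names `orders`, `orders_copy`, `modified_at`, and `incremental_load` are hypothetical. It also assumes timestamps are stored as ISO 8601 text, so string comparison matches chronological order.

```python
import sqlite3


def incremental_load(source, warehouse, last_watermark):
    """Copy rows changed after last_watermark; return the new watermark."""
    rows = source.execute(
        "SELECT id, payload, modified_at FROM orders "
        "WHERE modified_at > ? ORDER BY modified_at",
        (last_watermark,),
    ).fetchall()
    # Upsert so that re-running a batch does not duplicate rows (SQLite syntax).
    warehouse.executemany(
        "INSERT OR REPLACE INTO orders_copy (id, payload, modified_at) "
        "VALUES (?, ?, ?)",
        rows,
    )
    warehouse.commit()
    return rows[-1][2] if rows else last_watermark


# Hypothetical source table with an indexed change-tracking column.
source = sqlite3.connect(":memory:")
source.executescript("""
CREATE TABLE orders (id INTEGER PRIMARY KEY, payload TEXT, modified_at TEXT);
CREATE INDEX ix_orders_modified_at ON orders (modified_at);
INSERT INTO orders VALUES
    (1, 'a', '2019-12-22T10:00:00'),
    (2, 'b', '2019-12-22T11:00:00');
""")
warehouse = sqlite3.connect(":memory:")
warehouse.execute(
    "CREATE TABLE orders_copy (id INTEGER PRIMARY KEY, payload TEXT, modified_at TEXT)"
)

# First run copies everything; the second run copies only the new row.
watermark = incremental_load(source, warehouse, "")
source.execute("INSERT INTO orders VALUES (3, 'c', '2019-12-23T09:00:00')")
source.commit()
watermark = incremental_load(source, warehouse, watermark)
```

A timestamp watermark does not see deleted rows, so a design along these lines would need a separate mechanism (for example soft deletes) if deletions matter.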

Nulls in dimension table for numeric attributes

淺唱寂寞╮ posted on 2019-12-22 13:58:58
Question: What is the best way to handle missing values in a dimension table? In the case of a textual column, it is easy to write "NA: Missing," but what should be done for numeric columns where it is important to retain the specific values? Note: I do not want a solution that uses banded values (e.g., textual columns for "0-50", "50-100", "NA: Missing"). For instance, a customer dimension may have a year-of-birth. How should missing years of birth be handled? Leave it null? Add in an arbitrary
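One consideration when choosing between the two options the question mentions is how each behaves under aggregation. The sketch below, using Python's built-in `sqlite3` and a hypothetical `dim_customer` table, shows that SQL's `AVG` skips NULLs, while an arbitrary stand-in value (0 is used here purely as an example) is averaged in like any real value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_customer (
    customer_key  INTEGER PRIMARY KEY,
    year_of_birth INTEGER          -- NULL when unknown
);
INSERT INTO dim_customer VALUES (1, 1980), (2, 1990), (3, NULL);
""")

# AVG ignores NULLs, so only the two known years contribute.
avg_with_null = conn.execute(
    "SELECT AVG(year_of_birth) FROM dim_customer"
).fetchone()[0]

# Replacing NULL with an arbitrary value (0 here) pulls the average down.
avg_with_sentinel = conn.execute(
    "SELECT AVG(COALESCE(year_of_birth, 0)) FROM dim_customer"
).fetchone()[0]
```

With this data `avg_with_null` is 1985.0, whereas the stand-in value drags `avg_with_sentinel` down to roughly 1323.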

SSAS Dimension attribute as Calculated Measure

浪子不回头ぞ posted on 2019-12-22 10:44:28
Question: I am having some issues trying to implement an average of a dimension attribute. The basic structure is:
- Booking Header dimension
- Fact table (multiple rows per Booking Header entry)
On the Booking Header dimension I have a numeric attribute called Booking Window, and I want to create a calculated measure that averages this value. We are using SQL Server 2012 Standard Edition. Any help would be greatly appreciated.

Answer 1: The best approach would be to create a measure group from the
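The difficulty in this setup comes from grain: the attribute lives once per booking header, but the fact table has several rows per header. The sketch below is plain SQL run through Python's built-in `sqlite3`, not SSAS or MDX, and the table names are hypothetical. It only shows how the average differs depending on whether it is taken over the dimension rows or over the joined fact rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_booking_header (
    booking_key    INTEGER PRIMARY KEY,
    booking_window INTEGER
);
CREATE TABLE fact_booking_line (
    line_id     INTEGER PRIMARY KEY,
    booking_key INTEGER REFERENCES dim_booking_header (booking_key)
);
-- Booking 1 has three fact rows, booking 2 has one.
INSERT INTO dim_booking_header VALUES (1, 10), (2, 40);
INSERT INTO fact_booking_line VALUES (1, 1), (2, 1), (3, 1), (4, 2);
""")

# Average at the dimension grain: each booking counts once.
avg_per_header = conn.execute(
    "SELECT AVG(booking_window) FROM dim_booking_header"
).fetchone()[0]

# Average at the fact grain: each booking is weighted by its number of fact rows.
avg_per_fact_row = conn.execute("""
    SELECT AVG(h.booking_window)
    FROM fact_booking_line f
    JOIN dim_booking_header h ON h.booking_key = f.booking_key
""").fetchone()[0]
```

Here `avg_per_header` is 25.0 and `avg_per_fact_row` is 17.5, so the measure has to be defined at the grain that matches the intended meaning.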

Time-based drilldowns in Power BI powered by Azure Data Warehouse

血红的双手。 posted on 2019-12-22 07:57:29
Question: I have designed a simple Azure Data Warehouse where I want to track the stock of my products on a periodic basis. Moreover, I want to be able to see that data grouped by month, week, day and hour, with the ability to drill down from top to bottom. I have defined 3 dimensions:
- DimDate
- DimTime
- DimProduct
I have also defined a fact table to track product stocks:
FactStocks
- DateKey (20160510, 20160511, etc.)
- TimeKey (0..23)
- ProductKey (Product1, Product2)
- StockValue (number, 1..9999)
My
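To make the key layout above concrete, the sketch below derives `DateKey` and `TimeKey` from a timestamp and rolls stock values up to month level in plain Python. The function names `to_keys` and `average_stock_by_month` are hypothetical, and averaging within the month is an assumption made here because stock levels are snapshots that generally should not be summed across time; the question itself does not state which aggregation is wanted.

```python
from collections import defaultdict
from datetime import datetime


def to_keys(ts):
    """Map a timestamp to (DateKey, TimeKey), e.g. (20160510, 14)."""
    return int(ts.strftime("%Y%m%d")), ts.hour


def average_stock_by_month(rows):
    """Roll (DateKey, TimeKey, ProductKey, StockValue) rows up to month level.

    Returns {(YYYYMM, ProductKey): average StockValue}.
    """
    buckets = defaultdict(list)
    for date_key, _time_key, product_key, stock_value in rows:
        month_key = date_key // 100  # 20160510 -> 201605
        buckets[(month_key, product_key)].append(stock_value)
    return {key: sum(values) / len(values) for key, values in buckets.items()}


# Hypothetical sample rows in the FactStocks layout.
sample_rows = [
    (20160510, 0, "Product1", 10),
    (20160511, 0, "Product1", 20),
    (20160601, 0, "Product1", 30),
]
monthly = average_stock_by_month(sample_rows)
```

The same integer arithmetic on `DateKey` (dropping the last two digits) is what a month attribute on DimDate would encode; weeks do not align with months, so a week level would need its own attribute rather than a derivation from the month.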

Strategies for populating a Reporting/Data Warehouse database

旧巷老猫 posted on 2019-12-21 22:47:28
Question: For our reporting application, we have a process that aggregates several databases into a single 'reporting' database on a nightly basis. The schema of the reporting database is quite different from that of the separate 'production' databases that we are aggregating, so a good amount of business logic goes into how the data is aggregated. Right now this process is implemented by several stored procedures that run nightly. As we add more details to the reporting database the logic

Creating real time datawarehouse

ε祈祈猫儿з posted on 2019-12-21 21:28:21
Question: I am doing a personal project that consists of creating the full architecture of a data warehouse (DWH). I decided to use Pentaho as the ETL and BI analysis tool; it offers a lot of functionality, from easy dashboard creation to full data mining processes and OLAP cubes. I have read that a data warehouse must be a relational database, and I understand this. What I don't understand is how to achieve a near-real-time or fully real-time DWH. I have read about push and pull