data-warehouse | 易学教程

How to pivot data using Informatica when you have variable amount of pivot rows?

Move SQL Server Database data to SAP BW

I have read a few articles about moving data out of SAP BW and into SQL Server, but I can't find any on moving data from SQL Server to SAP BW. Is it even possible, and if so, what would be the best way to handle this?

After searching on this topic, I found many links addressing this issue. In this answer I will try to summarize them and provide all the links that can help you achieve your goal. There are many ways to import data from SQL Server into SAP BW: (1) SAP BW DB Connect. With DB Connect, you can load data from a database system that is supported by SAP, by linking a database

Using a DATE field as primary key of a date dimension with MySQL

I want to handle a date dimension in a MySQL data warehouse (I'm a newbie in the DW world). I searched with Google and saw a lot of date dimension table structures, most of which use a simple UNSIGNED INTEGER as the primary key. Why not use a DATE field as the primary key, since in MySQL it takes 3 bytes versus 4 bytes for INTEGER? Ex: CREATE TABLE dimDate (id INTEGER UNSIGNED NOT NULL PRIMARY KEY AUTO_INCREMENT, date DATE NOT NULL, dayOfWeek ...) vs. CREATE TABLE dimDate (date DATE NOT NULL PRIMARY KEY, dayOfWeek ...)

If you have a table with a column that is of date type and where no two rows will ever
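The second layout from the question (the date itself as the primary key) can be tried out with a small sketch. This is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module as a stand-in for MySQL, so the 3-byte versus 4-byte storage comparison does not carry over, and the `factSales` table, the extra `month` and `year` columns, and the sample rows are hypothetical.

```python
import sqlite3
from datetime import date, timedelta

conn = sqlite3.connect(":memory:")

# Date dimension keyed by the date itself (stored as ISO 'YYYY-MM-DD' text in SQLite).
conn.execute("""
    CREATE TABLE dimDate (
        date      TEXT    NOT NULL PRIMARY KEY,
        dayOfWeek INTEGER NOT NULL,  -- 1 = Monday ... 7 = Sunday
        month     INTEGER NOT NULL,
        year      INTEGER NOT NULL
    )
""")

# Hypothetical fact table that references the dimension by date.
conn.execute("""
    CREATE TABLE factSales (
        date   TEXT NOT NULL REFERENCES dimDate(date),
        amount REAL NOT NULL
    )
""")

# Populate one month of the dimension.
start = date(2024, 1, 1)
days = [start + timedelta(days=i) for i in range(31)]
conn.executemany(
    "INSERT INTO dimDate VALUES (?, ?, ?, ?)",
    [(d.isoformat(), d.isoweekday(), d.month, d.year) for d in days],
)
conn.executemany(
    "INSERT INTO factSales VALUES (?, ?)",
    [("2024-01-05", 10.0), ("2024-01-06", 20.0), ("2024-01-20", 5.0)],
)

# With a date key, a range filter needs no join to the dimension.
first_week_total = conn.execute(
    "SELECT SUM(amount) FROM factSales WHERE date BETWEEN ? AND ?",
    ("2024-01-01", "2024-01-07"),
).fetchone()[0]

# Descriptive attributes still come from the dimension through a join.
weekend_total = conn.execute("""
    SELECT SUM(f.amount)
    FROM factSales AS f
    JOIN dimDate AS d ON d.date = f.date
    WHERE d.dayOfWeek IN (6, 7)
""").fetchone()[0]
```

One practical consequence of the date key is visible in the first query: the fact table can be filtered by date range directly, while an opaque integer key would need a join (or a key-range lookup) first.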

Is a fact table in normalized or de-normalized form?

I did a bit of research on fact tables and whether they are normalized or denormalized, and I came across some findings that confuse me. According to Kimball: Dimensional models combine normalized and denormalized table structures. The dimension tables of descriptive information are highly denormalized with detailed and hierarchical roll-up attributes in the same table. Meanwhile, the fact tables with performance metrics are typically normalized. While we advise against a fully normalized with snowflaked dimension attributes in separate tables (creating blizzard-like conditions for the business

Is it possible to partially refresh a materialized view in Oracle?

I have a very complex Oracle view based on other materialized views, regular views as well as some tables (I can't "fast refresh" it). Most of the time, existing records in this view are based on a date and are "stable", with new record sets having new dates. Occasionally, I receive back-dates. I know what those are and how to deal with them if I were maintaining a table, but I would like to keep this a "view". A complete refresh would take around 30 minutes, but it only takes 25 seconds for any given date. Can I specify that only a portion of a materialized view should be updated (i.e. the

Data warehousing principles and NoSQL

With MongoDB, CouchDB and related technologies we can get faster querying, so is this still valid? “A copy of transaction data, specially restructured for queries and analyses.” (R. Kimball, The Data Warehouse Toolkit, 1996) I mean, do we really need to restructure our data into an OLAP schema to query it for analysis purposes? More specifically, can drill-down, slice and dice, and other reporting for analysis purposes be achieved with NoSQL (NOT necessarily with OLAP modelling)? Also, could we overcome the "data subset" querying limitation of OLAP and report on the whole data universe with NoSQL? In

What is a staging table?

Are staging tables used only in data warehouse projects, or in any SSIS project? I would like to know what a staging table is. Can anyone give me some examples of how to use one and in what circumstances it is implemented? Also, what are the best practices when using it?

Staging tables are just database tables containing your business data in some form or other. Staging is the process of preparing your business data, usually taken from some business application. For your average BI system you have to prepare the data before loading it. A staging table is essentially just a temporary
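The idea of preparing data in a staging table before loading it can be sketched as follows. This is a minimal illustration, not an SSIS package: it assumes Python's built-in sqlite3 module, and the table names `stg_orders` and `orders`, the sample rows, and the cleaning rules are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Staging table: everything lands as text, exactly as extracted.
    CREATE TABLE stg_orders (order_id TEXT, customer TEXT, amount TEXT);

    -- Target table: typed and constrained.
    CREATE TABLE orders (
        order_id INTEGER PRIMARY KEY,
        customer TEXT NOT NULL,
        amount   REAL NOT NULL
    );
""")

# Step 1: land the raw extract in staging without validation.
raw_rows = [
    ("1", "  alice ", "10.50"),
    ("2", "BOB", "n/a"),          # bad amount, filtered out below
    ("3", "carol", "7"),
    ("1", "  alice ", "10.50"),   # exact duplicate of the first row
]
conn.executemany("INSERT INTO stg_orders VALUES (?, ?, ?)", raw_rows)

# Step 2: clean, type-cast, and de-duplicate while loading the target.
conn.execute("""
    INSERT INTO orders (order_id, customer, amount)
    SELECT DISTINCT
        CAST(order_id AS INTEGER),
        LOWER(TRIM(customer)),
        CAST(amount AS REAL)
    FROM stg_orders
    WHERE amount GLOB '[0-9]*'            -- starts with a digit
      AND amount NOT GLOB '*[^0-9.]*'     -- contains only digits and dots
""")

# Step 3: empty the staging table so the next batch starts clean.
conn.execute("DELETE FROM stg_orders")

loaded = conn.execute(
    "SELECT order_id, customer, amount FROM orders ORDER BY order_id"
).fetchall()
staged_left = conn.execute("SELECT COUNT(*) FROM stg_orders").fetchone()[0]
```

Keeping the staging columns as untyped text is a deliberate choice in this sketch: the load step never fails on bad input, and all validation happens in one place when rows move to the target.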

Database architecture for millions of new rows per day

I need to implement a custom-developed web analytics service for a large number of websites. The key entities here are Website and Visitor. Each unique visitor will have a single row in the database with information like landing page, time of day, OS, browser, referrer, IP, etc. I will need to run aggregate queries on this database, such as 'COUNT all visitors who have Windows as OS and came from Bing.com'. I have hundreds of websites to track, and the number of visitors for those websites range
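The example aggregate from the question can be expressed as a small sketch. It assumes Python's built-in sqlite3 module rather than any particular production database, and the `visitors` schema, the index, the `count_visitors` helper, and the sample rows are hypothetical; it says nothing about how such a design behaves at millions of rows per day.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- One row per unique visitor, as described in the question.
    CREATE TABLE visitors (
        visitor_id    INTEGER PRIMARY KEY,
        website_id    INTEGER NOT NULL,
        os            TEXT,
        browser       TEXT,
        referrer_host TEXT,
        landing_page  TEXT
    );
    -- Composite index matching the filter columns of the example query.
    CREATE INDEX idx_visitors_site_os_ref
        ON visitors (website_id, os, referrer_host);
""")

conn.executemany(
    "INSERT INTO visitors (website_id, os, browser, referrer_host, landing_page) "
    "VALUES (?, ?, ?, ?, ?)",
    [
        (1, "Windows", "Edge",    "bing.com",   "/"),
        (1, "Windows", "Chrome",  "bing.com",   "/pricing"),
        (1, "Windows", "Firefox", "google.com", "/"),
        (1, "macOS",   "Safari",  "bing.com",   "/"),
        (2, "Windows", "Edge",    "bing.com",   "/"),
    ],
)


def count_visitors(conn, website_id, os_name, referrer_host):
    """Count one website's visitors with a given OS and referrer host."""
    row = conn.execute(
        "SELECT COUNT(*) FROM visitors "
        "WHERE website_id = ? AND os = ? AND referrer_host = ?",
        (website_id, os_name, referrer_host),
    ).fetchone()
    return row[0]


windows_from_bing = count_visitors(conn, 1, "Windows", "bing.com")
```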

What is the actual difference between Data Warehouse & Big Data?

I know what a data warehouse is and what Big Data is, but I am confused about data warehouse vs. Big Data. Are they the same thing under different names, or are they different (conceptually and physically)?

I know that this is an older thread, but there have been some developments in the last year or so. Comparing the data warehouse to Hadoop is like comparing apples to oranges. The data warehouse is a concept: clean, integrated data of high quality. I don't think the need for a data warehouse will go away anytime soon. Hadoop, on the other hand, is a technology. It is a distributed compute framework to process large