data-warehouse

Informatica writes rejected rows into a bad file, how to avoid that?

自古美人都是妖i, submitted on 2019-12-21 20:25:00
Question: I have developed an Informatica PowerCenter 9.1 ETL job which uses a lookup and an update transform to detect whether the target table already has the incoming rows from the source. I have set the following condition on the Update transform: IIF(ISNULL(target_table_surrogate_id), DD_INSERT, DD_REJECT). Now, when the incoming row is already in the target table, the row is rejected. Informatica writes these rejected rows into a .bad file. How can I prevent this? Is there a way to determine that the rejected rows
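
The routing decision described in this question can be illustrated outside Informatica. The following is only a Python sketch of the logic, not Informatica code: `route_row`, `split_rows`, the string flags, and the sample keys are hypothetical names chosen for illustration.

```python
# Hypothetical stand-ins for the DD_INSERT / DD_REJECT flags named in the question.
DD_INSERT = "DD_INSERT"
DD_REJECT = "DD_REJECT"


def route_row(target_table_surrogate_id):
    """Mirror IIF(ISNULL(target_table_surrogate_id), DD_INSERT, DD_REJECT)."""
    return DD_INSERT if target_table_surrogate_id is None else DD_REJECT


def split_rows(source_rows, lookup):
    """Split rows by flag; `lookup` maps a natural key to a surrogate id."""
    inserts, rejects = [], []
    for row in source_rows:
        # The lookup returns None when the row is not yet in the target table.
        surrogate_id = lookup.get(row["natural_key"])
        if route_row(surrogate_id) == DD_INSERT:
            inserts.append(row)
        else:
            rejects.append(row)
    return inserts, rejects


existing = {"A": 1, "B": 2}  # rows already present in the target (sample data)
incoming = [{"natural_key": "A"}, {"natural_key": "C"}]
inserts, rejects = split_rows(incoming, existing)
print(inserts)
print(rejects)
```

In this sketch, every row that ends up in `rejects` corresponds to a row that, according to the question, Informatica writes to the .bad file.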

PostgreSQL to Data-Warehouse: Best approach for near-real-time ETL / extraction of data

心已入冬, submitted on 2019-12-20 10:33:35
Question: Background: I have a PostgreSQL (v8.3) database that is heavily optimized for OLTP. I need to extract data from it on a semi-real-time basis (someone is bound to ask what semi-real-time means; the answer is as frequently as I reasonably can, but I will be pragmatic: as a benchmark, let's say we are hoping for every 15 minutes) and feed it into a data warehouse. How much data? At peak times we are talking approx. 80-100k rows per minute hitting the OLTP side; off-peak this will drop significantly to
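
To put the figures from this question in perspective, here is a rough back-of-the-envelope calculation of the peak batch size. It assumes, purely for illustration, that every row hitting the OLTP side must be extracted; the constant and function names are hypothetical.

```python
PEAK_ROWS_PER_MIN = (80_000, 100_000)  # peak range quoted in the question
BATCH_INTERVAL_MIN = 15                # benchmark interval quoted in the question


def rows_per_batch(rows_per_min, interval_min=BATCH_INTERVAL_MIN):
    """Rows accumulated between two extractions at a constant arrival rate."""
    return rows_per_min * interval_min


low, high = (rows_per_batch(rate) for rate in PEAK_ROWS_PER_MIN)
print(f"{low:,} to {high:,} rows per {BATCH_INTERVAL_MIN}-minute batch at peak")
```

Under that assumption, each 15-minute extraction at peak would need to move roughly 1.2 to 1.5 million rows.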

What should I have in mind when building an OLAP solution from scratch?

狂风中的少年, submitted on 2019-12-20 09:41:29
Question: I'm working for a company running a software product based on an MS SQL database server, and over the years I have developed 20-30 quite advanced reports in PHP, taking data directly from the database. This has been very successful, and people are happy with it. But it has some drawbacks: (1) new changes can be quite development-intensive; (2) the user can't experiment much with the data, as it is locked to a hard-coded view; (3) it can be slow for big reports. I am considering gradually going to a

Benefits of using Staging Database while designing Data Warehouse

陌路散爱, submitted on 2019-12-20 09:18:44
Question: I am in the process of designing a data warehouse architecture. While exploring various options for extracting data from production and loading it into the data warehouse, I came across many articles which mainly suggested the following two approaches: (1) Production DB ----> Data Warehouse (Star Schema) ----> OLAP Cube; (2) Production DB ----> Staging Database ----> Data Warehouse (Star Schema) ----> OLAP Cube. I am still not sure which one is the better approach in terms of performance and reducing processing load on
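
The second approach in this question (with a staging database) can be sketched in miniature. This is a minimal illustration under stated assumptions, not a recommended design: an in-memory SQLite database stands in for both the staging area and the warehouse, and the table names (`stg_sales`, `dim_product`, `fact_sales`) and sample rows are hypothetical.

```python
import sqlite3

# In-memory SQLite stands in for the staging database and the warehouse (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE stg_sales (product_name TEXT, amount REAL);
    CREATE TABLE dim_product (
        product_id INTEGER PRIMARY KEY,
        product_name TEXT UNIQUE
    );
    CREATE TABLE fact_sales (product_id INTEGER, amount REAL);
""")

# Step 1: copy production rows into staging without transforming them.
production_rows = [("widget", 10.0), ("gadget", 5.0), ("widget", 2.5)]
conn.executemany("INSERT INTO stg_sales VALUES (?, ?)", production_rows)

# Step 2: build the star schema from staging, so the joins and lookups
# run against the staging copy rather than the production database.
conn.execute(
    "INSERT OR IGNORE INTO dim_product (product_name) "
    "SELECT DISTINCT product_name FROM stg_sales"
)
conn.execute("""
    INSERT INTO fact_sales (product_id, amount)
    SELECT d.product_id, s.amount
    FROM stg_sales AS s
    JOIN dim_product AS d USING (product_name)
""")

fact_count = conn.execute("SELECT COUNT(*) FROM fact_sales").fetchone()[0]
dim_count = conn.execute("SELECT COUNT(*) FROM dim_product").fetchone()[0]
print(fact_count, dim_count)
```

In the first approach, step 2 would read directly from the production tables instead of from `stg_sales`.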

NoSql and Data-Warehouse

旧巷老猫, submitted on 2019-12-20 08:00:54
Question: What are the relations between NoSQL and data warehouse technologies/theories? What concepts do they share? What are the basic differences between them? How do you think each could benefit from or be enriched by the other? I think your ideas should be helpful for the future of both technologies. UPDATE: Some useful links: Integrating NoSQL in the Data Warehouse; NoSQL and Data Warehousing; Are You Ready for Big Data? 2nd UPDATE: MongoDB, BI and Non-Relational Databases. Answer 1: Data Warehouses have very

Move SQL Server Database data to SAP BW

早过忘川, submitted on 2019-12-19 10:52:54
Question: I have read a few articles about moving data out of SAP BW and into SQL Server. I can't find any articles on moving data from SQL Server to SAP BW. Is it even possible, and if so, what would be the best way to handle this? Answer 1: After searching on this topic, I found many links addressing this issue. In this answer I will try to summarize them all and provide all the links that can help you achieve your goal. There are many ways to import data from SQL Server into SAP BW: (1) SAP BW DB Connect

Using a DATE field as primary key of a date dimension with MySQL

柔情痞子, submitted on 2019-12-19 05:49:17
Question: I want to handle a date dimension in a MySQL data warehouse. (I'm a newbie in the DW world.) I searched Google and saw a lot of date dimension table structures, in most of which the primary key is a simple UNSIGNED INTEGER. Why not use a DATE field as the primary key, since in MySQL it takes 3 bytes vs. 4 bytes for INTEGER? Ex: CREATE TABLE dimDate (id INTEGER UNSIGNED NOT NULL PRIMARY KEY AUTO_INCREMENT, date DATE NOT NULL, dayOfWeek ... VS CREATE TABLE dimDate (date DATE NOT NULL PRIMARY KEY,
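
To make the two key choices in this question concrete, the sketch below generates date dimension rows that carry both a sequential integer `id` (as in the first variant) and the calendar date itself (as in the second). It is a hypothetical illustration: `date_dimension` and the sample date range are assumptions, and only `id`, `date`, and `dayOfWeek` come from the question.

```python
from datetime import date, timedelta


def date_dimension(start, end):
    """Return one row per calendar day from start to end, inclusive."""
    rows = []
    day = start
    next_id = 1  # plays the role of an auto-incremented surrogate key
    while day <= end:
        rows.append({
            "id": next_id,                  # integer key, first variant
            "date": day.isoformat(),        # DATE key, second variant
            "dayOfWeek": day.isoweekday(),  # 1 = Monday ... 7 = Sunday
        })
        day += timedelta(days=1)
        next_id += 1
    return rows


rows = date_dimension(date(2019, 12, 30), date(2020, 1, 2))
for row in rows:
    print(row)
```

With the DATE key, a fact row already carries a readable date; with the integer key, the date is only available through a join to the dimension.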

Using a DATE field as primary key of a date dimension with MySQL

六眼飞鱼酱①, submitted on 2019-12-19 05:49:08
Question: I want to handle a date dimension in a MySQL data warehouse. (I'm a newbie in the DW world.) I searched Google and saw a lot of date dimension table structures, in most of which the primary key is a simple UNSIGNED INTEGER. Why not use a DATE field as the primary key, since in MySQL it takes 3 bytes vs. 4 bytes for INTEGER? Ex: CREATE TABLE dimDate (id INTEGER UNSIGNED NOT NULL PRIMARY KEY AUTO_INCREMENT, date DATE NOT NULL, dayOfWeek ... VS CREATE TABLE dimDate (date DATE NOT NULL PRIMARY KEY,

Is it possible to partially refresh a materialized view in Oracle?

 ̄綄美尐妖づ, submitted on 2019-12-19 05:21:15
Question: I have a very complex Oracle view based on other materialized views, regular views, and some tables (I can't "fast refresh" it). Most of the time, existing records in this view are based on a date and are "stable", with new record sets having new dates. Occasionally, I receive back-dates. I know what those are and how to deal with them if I were maintaining a table, but I would like to keep this a "view". A complete refresh would take around 30 minutes, but it only takes 25 seconds for

In a star schema, are foreign key constraints between facts and dimensions necessary?

守給你的承諾、, submitted on 2019-12-18 16:28:19
Question: I'm getting my first exposure to data warehousing, and I'm wondering whether it is necessary to have foreign key constraints between facts and dimensions. Are there any major downsides to not having them? I'm currently working with a relational star schema. In traditional applications I'm used to having them, but I started to wonder if they were needed in this case. I'm currently working in a SQL Server 2005 environment. UPDATE: For those interested, I came across a poll asking the same question. Answer 1
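
As a small illustration of what a foreign key constraint between a fact and a dimension actually enforces, here is a sketch that uses an in-memory SQLite database as a stand-in for the SQL Server 2005 environment in the question. The table names (`dim_customer`, `fact_order`) and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite does not enforce foreign keys unless this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE dim_customer (
        customer_key INTEGER PRIMARY KEY,
        name TEXT
    );
    CREATE TABLE fact_order (
        customer_key INTEGER NOT NULL REFERENCES dim_customer (customer_key),
        amount REAL
    );
    INSERT INTO dim_customer VALUES (1, 'Alice');
""")

# A fact row whose key exists in the dimension is accepted.
conn.execute("INSERT INTO fact_order VALUES (1, 9.99)")

# A fact row pointing at a missing dimension row is refused by the constraint.
try:
    conn.execute("INSERT INTO fact_order VALUES (42, 1.00)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

print(orphan_rejected)
```

Without the constraint, the second insert would succeed and leave an orphaned fact row, so that check would have to be done by the load process instead.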