dimensional-modeling

SCD 1 dimension without surrogate key

旧城冷巷雨未停 submitted on 2021-02-11 14:49:12
Question: This reference to the Kimball Group states that all dimensions should have surrogate keys, except some very predictable ones like the date dimension. I have exactly the same case as described on the SCD Type 1 Wiki page: "Technically, the surrogate key is not necessary, since the row will be unique by the natural key (Supplier_Code)." Data is loaded from the operational system without a surrogate key, and I am calculating the surrogate key in ETL based on a single, unique xxx_code column. SCD Type 1, full load. Are
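The ETL step described in the question (assigning a surrogate key from a single unique code column and overwriting attributes, SCD Type 1) might look roughly like the following. This is only a minimal sketch under stated assumptions: `load_scd1`, `supplier_key`, and `supplier_state` are hypothetical names chosen for illustration, and the dimension is held in a plain dict rather than a real warehouse table.

```python
def load_scd1(dimension, source_rows, natural_key="supplier_code"):
    """Upsert source rows into an SCD Type 1 dimension.

    `dimension` maps natural key -> row dict; each row carries a
    hypothetical integer surrogate key under "supplier_key".
    """
    # Continue numbering after the highest surrogate key already assigned.
    next_key = max((row["supplier_key"] for row in dimension.values()), default=0) + 1
    for src in source_rows:
        code = src[natural_key]
        existing = dimension.get(code)
        if existing is None:
            # New natural key: assign the next surrogate key.
            dimension[code] = {"supplier_key": next_key, **src}
            next_key += 1
        else:
            # Type 1: overwrite attributes in place, keep no history.
            existing.update(src)
    return dimension


dim = {}
load_scd1(dim, [
    {"supplier_code": "ABC", "supplier_state": "CA"},
    {"supplier_code": "XYZ", "supplier_state": "NY"},
])
# A later load changes an attribute; the surrogate key stays the same.
load_scd1(dim, [{"supplier_code": "ABC", "supplier_state": "IL"}])
print(dim["ABC"])
```

Because the surrogate key is derived one-to-one from the natural key, the sketch makes the question's point visible: both keys identify the same row.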

Microsoft Azure Data Warehouse: Flat Tables or Star Schema

ε祈祈猫儿з submitted on 2020-02-21 07:32:51
Question: I am creating a data warehouse model on numerous OLTP tables. I can either use (a) a star schema or (b) a flat table model. Many people think a dimensional star schema is not required, because most data can be reported from a single table. Additionally, the Kimball star schema was created when performance and storage were an issue. Some claim that with improved technology, data can be presented in a single table. Should I still separate data into dimension/fact tables or just use the flat

How deep to go when denormalising

こ雲淡風輕ζ submitted on 2020-01-15 09:13:17
Question: I am denormalising an OLTP database for use in a DWH. At the moment I am denormalising studygroups. Each studygroup has a key pointing towards 1 project. Each project has a key pointing towards 1 department. Each department has a key pointing towards 1 university. Each university has a key pointing to 1 city. Now I know that you are supposed to denormalize the sh*t out of your OLTP, but in this DWH department will be a dimension on its own. This goes for university also. Would it suffice to add a key

Star-Schema Design [closed]

别说谁变了你拦得住时间么 submitted on 2020-01-09 04:01:05
Question: Is a Star-Schema design essential to a data warehouse? Or can you do data warehousing with another design pattern?
Answer 1: Using star schemas for a data warehouse system gets you several benefits, and in most cases it is appropriate to use them for the top layer. You may also

Creating Relationships while avoiding ambiguities

丶灬走出姿态 submitted on 2020-01-07 08:21:04
Question: I have a flat table like this:

R#  Cat  SWN  CWN  CompBy  ReqBy   Department
1   A    1    1    Team A  Team B  Department 1
2   A    1    3    Team A  Team B  Department 1
3   B    1    3    Team A  Team B  Department 1
4   B    2    3    Team A  Team C  Department 1
5   B    2    3    Team D  Team C  Department 2
6   C    2    2    Team D  Team C  Department 2

R# indicates the Request Number, Cat indicates the Category, SWN indicates the Submitted Week Number, CWN indicates the Completed Week Number, CompBy indicates Completed By, ReqBy indicates Requested By, Department

How do I dimensionally model this relationship in a Kimball-style data warehouse?

倾然丶 夕夏残阳落幕 submitted on 2020-01-07 00:58:11
Question: So I have two dimensions in my data warehouse:

dim_machine
-------------
machine_key
machine_name
machine_type

dim_tool
------------
tool_key
tool_name
machine_type

What I want to make sure of is that the machine_type field in both dimensions has the same data. Should I create a third dimension to snowflake between the two, or is there another alternative?
Answer 1: I'm not sure exactly what problem you're trying to solve. This sounds like something that you would simply build into the ETL process: for

Redshift Performance of Flat Tables Vs Dimension and Facts

天大地大妈咪最大 submitted on 2019-12-18 12:40:09
Question: I am trying to create a dimensional model on flat OLTP tables (not in 3NF). Some people think a dimensional model is not required because most of the data for the report is present in a single table. But that table contains more than we need, around 300 columns. Should I still separate the flat table into dimensions and facts, or just use the flat tables directly in the reports?
Answer 1: When creating tables purely for reporting purposes (as is typical in a Data Warehouse), it is

Why NULL values are mapped as 0 in Fact tables?

非 Y 不嫁゛ submitted on 2019-12-18 08:23:15
Question: Why are NULL values in measure fields of fact tables (in dimensionally modeled data warehouses) usually mapped to 0?
Answer 1: Although you've already accepted another answer, I would say that using NULL is actually a better choice, for a couple of reasons. The first reason is that aggregates return the 'correct' answer (i.e. the one that users tend to expect) when NULL is present but give the 'wrong' answer when you use zero. Consider the results from AVG() in these two queries:
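The original queries are not included in this excerpt. As a separate illustration of the same point, the sketch below compares AVG() over a measure stored as NULL versus stored as 0, using Python's built-in sqlite3 module. The table `fact_sales`, the column `amount`, and the helper `average_measure` are hypothetical names, and the values are made up for the example.

```python
import sqlite3


def average_measure(values):
    """Load values into a one-column fact table and return AVG() over it."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE fact_sales (amount REAL)")  # hypothetical table
    conn.executemany(
        "INSERT INTO fact_sales (amount) VALUES (?)",
        [(v,) for v in values],  # Python None is stored as SQL NULL
    )
    (avg,) = conn.execute("SELECT AVG(amount) FROM fact_sales").fetchone()
    conn.close()
    return avg


# Missing measure kept as NULL: AVG() skips it and averages the two known rows.
avg_with_null = average_measure([10.0, 20.0, None])
# Missing measure mapped to 0: the zero counts as a real observation.
avg_with_zero = average_measure([10.0, 20.0, 0.0])

print(avg_with_null)  # 15.0
print(avg_with_zero)  # 10.0
```

SQL aggregates such as AVG() ignore NULL inputs, so the NULL version averages over two rows while the zero version averages over three.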

Inmon data Marts vs Kimball data marts

一个人想着一个人 submitted on 2019-12-11 12:14:20
Question: Is the only difference between Kimball and Inmon the enterprise layer (EDW)? I was googling around and found out that Inmon also creates data marts from the EDW. So does that mean both kinds of data marts are similar in structure for a given business process and source systems? Once the data marts are available under both approaches, do they give the same performance? Correct me if I am wrong: the data warehouse is created first and then the dimensional model is created on top of it for

OLAP cube design reference for a IT support business [closed]

徘徊边缘 submitted on 2019-12-11 03:57:41
Question: We are designing a dimensional model for an IT support business. There are cases (some call them tickets or incidents) with different statuses (feels like an SCD Type II dimension). We also need to consider the count of cases and SLA time duration as measures. Before going into