data-warehouse

MERGE - conditional “WHEN MATCHED THEN UPDATE”

梦想的初衷 submitted on 2021-02-17 21:14:10
Question: The highlights in the image below show the logic I want to implement. I realize the syntax is incorrect. Is there a way to conditionally update a record in a MERGE statement only if the value of one of its columns in the target table is NULL and the corresponding value in the source table is not NULL? How would you suggest rewriting this? MERGE dbo.input_311 AS [t] USING dbo.input_311_staging AS [s] ON ([t].[unique key] = [s].[unique key]) WHEN NOT MATCHED BY TARGET THEN INSERT(t.
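The usual fix is to add an extra predicate to the WHEN MATCHED clause itself. A minimal T-SQL sketch, assuming a placeholder column [some_column] (the original column list was truncated, so the name is illustrative):

```sql
MERGE dbo.input_311 AS [t]
USING dbo.input_311_staging AS [s]
    ON ([t].[unique key] = [s].[unique key])
-- Only update when the target value is missing and the source has one.
WHEN MATCHED AND [t].[some_column] IS NULL
             AND [s].[some_column] IS NOT NULL THEN
    UPDATE SET [t].[some_column] = [s].[some_column]
WHEN NOT MATCHED BY TARGET THEN
    INSERT ([unique key], [some_column])
    VALUES ([s].[unique key], [s].[some_column]);
```

SQL Server allows an AND condition on each WHEN MATCHED branch, so no subquery or separate UPDATE pass is needed.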

Data Warehousing - Star Schema vs Flat Table

我们两清 submitted on 2021-02-17 18:58:45
Question: I'm trying to design a Data Warehouse as a single store of commonly required data for finance systems, project-scheduling systems, and a myriad of scientific systems, i.e. many different data marts. I have been reading up on Data Warehousing and popular methods such as star schemas and Kimball methods, etc., but one question I cannot find an answer to is: why is it better to design your DW data mart as a star schema rather than a single flat table? Surely having no joins between facts and
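The trade-off can be seen in a small sketch; all table and column names below are illustrative, not from the question:

```sql
-- Star schema: a narrow fact table keyed to small dimension tables.
SELECT d.calendar_year, p.category, SUM(f.sales_amount) AS total_sales
FROM   fact_sales f
JOIN   dim_date    d ON d.date_key    = f.date_key
JOIN   dim_product p ON p.product_key = f.product_key
GROUP BY d.calendar_year, p.category;

-- Flat table: the same query needs no joins, but every descriptive
-- attribute is repeated on every fact row, inflating storage and
-- making attribute changes (e.g. renaming a category) full-table updates.
SELECT calendar_year, category, SUM(sales_amount) AS total_sales
FROM   flat_sales
GROUP BY calendar_year, category;
```

The flat query is simpler, but the star keeps descriptive attributes in one place per dimension, which is what makes SCD handling and conformed dimensions practical.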

SCD 1 dimension without surrogate key

旧城冷巷雨未停 submitted on 2021-02-11 14:49:12
Question: This reference from the Kimball Group states that all dimensions should have surrogate keys, except some very predictable ones like the date dimension. I have exactly the same case as described on the SCD Type 1 Wiki page: technically, the surrogate key is not necessary, since the row will be unique by the natural key (Supplier_Code). Data are loaded from the operational system without a surrogate key, while I calculate the surrogate key in ETL based on a single, unique xxx_code column. SCD Type 1, full load. Are
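For comparison, an SCD Type 1 load keyed directly on the natural key can be sketched as a plain overwrite upsert; the dimension and attribute names are illustrative, borrowed from the Supplier_Code example in the question:

```sql
-- SCD Type 1 overwrite keyed on the natural key, no surrogate involved.
MERGE dim_supplier AS t
USING stg_supplier AS s
    ON t.supplier_code = s.supplier_code   -- natural key as primary key
WHEN MATCHED THEN
    UPDATE SET t.supplier_name  = s.supplier_name,
               t.supplier_state = s.supplier_state
WHEN NOT MATCHED BY TARGET THEN
    INSERT (supplier_code, supplier_name, supplier_state)
    VALUES (s.supplier_code, s.supplier_name, s.supplier_state);
```

This works as long as the natural key is stable and unique; the surrogate key mainly buys insulation against source-system key changes and smaller join keys in facts.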

GDPR: encryption at-rest instead of data lookup tables [closed]

情到浓时终转凉″ submitted on 2021-02-11 14:44:58
Question: (Closed: this question does not meet Stack Overflow guidelines and is not accepting answers.) Encryption at rest means storing data inside your storage/database in encrypted form. During processing you need to decrypt the data every time, calculate something, and then encrypt everything back (encryption is managed by the storage layer). Does encryption at-rest resolve
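The lookup-table alternative named in the title is usually pseudonymization: PII is isolated in one table and everything else references an opaque key. A minimal sketch, with illustrative names:

```sql
-- Pseudonymization via a lookup table: PII lives only here, and the
-- rest of the warehouse references the opaque person_key.
-- Honoring an erasure request = deleting (or scrambling) one row here;
-- the remaining fact rows become unlinkable to a person.
CREATE TABLE person_lookup (
    person_key BIGINT PRIMARY KEY,   -- surrogate used everywhere else
    full_name  VARCHAR(200),
    email      VARCHAR(200)
);

CREATE TABLE fact_orders (
    order_id    BIGINT,
    person_key  BIGINT,              -- no PII, just the pseudonymous key
    order_total DECIMAL(12,2)
);
```

Encryption at rest, by contrast, protects against stolen disks and backups but does nothing once the storage layer transparently decrypts for queries, so the two mechanisms address different threats.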

bigquery aggregate for daily basis

为君一笑 submitted on 2021-02-10 17:30:36
Question: I have a table in BigQuery (data warehouse): and I would like to have this result: Here is the explanation of how the calculation should work: 2017-10-01 = $100 is obvious, because there is only one row. 2017-10-02 = $400 is the sum of the first and third rows. Why? Because the second and third rows have the same invoice, so we only use the latest update. 2017-10-04 = $800 is the sum of rows 1, 3, and 4. Why? Because we take only one row per invoice per day: row 1 (T001), row 3 (T002), row 4 (T003
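The input and expected-output tables were shown as images, so the schema below (invoice_id, amount, updated_at in a table `project.dataset.invoices`) is an assumption, and this is only one plausible reading of the rules: keep each invoice's latest update, count it on that date, and report a running total by day. A hedged BigQuery standard SQL sketch:

```sql
WITH latest AS (
  SELECT invoice_id,
         DATE(updated_at) AS d,
         amount,
         -- rn = 1 marks the most recent update of each invoice
         ROW_NUMBER() OVER (PARTITION BY invoice_id
                            ORDER BY updated_at DESC) AS rn
  FROM `project.dataset.invoices`
)
SELECT d,
       -- daily sum of surviving rows, accumulated across days
       SUM(SUM(amount)) OVER (ORDER BY d) AS running_total
FROM   latest
WHERE  rn = 1
GROUP BY d
ORDER BY d;
```

The deduplication step is what makes the second row drop out in favor of the third, matching the $400 explanation above.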

Data warehouse schema: is it OK to directly link fact tables in DWH?

半世苍凉 submitted on 2021-01-27 16:45:05
Question: Is it OK to directly link fact tables in a DWH? As I understand it, in a galaxy schema fact tables are not linked; they just share common dimension tables. But what if a DWH schema links them directly? Answer 1: IMO they shouldn't, even if they can. Fact tables are usually huge, with potentially many billions of rows, and hold measures at a certain grain. Linking two or more fact tables may require joining several multi-billion-row tables, which would be too expensive. If you need to
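The usual alternative to a direct fact-to-fact join is drill-across: aggregate each fact to a shared grain first, then join the small result sets on the conformed dimension key. A sketch with illustrative names:

```sql
-- Drill-across: each fact is aggregated to the shared (daily) grain
-- before joining, so no multi-billion-row fact-to-fact join occurs.
WITH sales_by_day AS (
  SELECT date_key, SUM(sales_amount) AS sales
  FROM   fact_sales
  GROUP BY date_key
),
shipments_by_day AS (
  SELECT date_key, SUM(shipped_qty) AS shipped
  FROM   fact_shipments
  GROUP BY date_key
)
SELECT d.calendar_date, s.sales, sh.shipped
FROM   dim_date d
JOIN   sales_by_day     s  ON s.date_key  = d.date_key
JOIN   shipments_by_day sh ON sh.date_key = d.date_key;
```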

Managing surrogate keys in a data warehouse

▼魔方 西西 submitted on 2020-12-25 04:57:20
Question: I want to build a data warehouse, and I want to use surrogate keys as primary keys for my fact tables. But the problem is that in my case fact tables need to be updated. The first question is: how do I find the corresponding auto-generated surrogate key for a natural key from the source system? I have seen some answers mentioning lookup tables that store the correspondence between natural and surrogate keys, but I didn't understand how exactly they are implemented. Where should this table be stored:
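A common implementation is a key map maintained in the warehouse's ETL/staging schema and joined to during every load. A minimal sketch, with illustrative names:

```sql
-- Key map: one row per business key ever seen from the source.
CREATE TABLE key_map_customer (
    natural_key   VARCHAR(50) PRIMARY KEY,  -- business key from source
    surrogate_key BIGINT NOT NULL           -- warehouse-generated key
);

-- During the fact load, resolve surrogate keys with a simple join;
-- rows whose natural key is missing from the map are flagged for
-- late-arriving-dimension handling instead of being silently dropped.
INSERT INTO fact_orders (customer_sk, order_total)
SELECT km.surrogate_key, s.order_total
FROM   stg_orders s
JOIN   key_map_customer km ON km.natural_key = s.customer_id;
```

Keeping the map inside the warehouse (not the source system) is the usual choice, since the surrogate keys only have meaning there.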

collecting annual aggregated data for later quick access

流过昼夜 submitted on 2020-12-15 20:04:56
Question: I have a number of SQL queries which take a year as a parameter and generate various annual reports for the given year. Those queries are quite cumbersome and take a considerable amount of time to execute (20-40 minutes). In order to give my users the ability to view annual reports whenever they need to, I am considering pre-executing these queries and storing the results for later use. One solution would be to schedule execution of these queries and insert the results into some temp tables. But
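A persistent summary table, refreshed by a scheduled job, is the standard alternative to ad-hoc temp tables. A sketch under assumed names (the actual report queries are not shown):

```sql
-- Durable cache of precomputed annual figures; the year and metric
-- name form the key so each refresh replaces exactly one slice.
CREATE TABLE annual_report_cache (
    report_year  INT,
    metric_name  VARCHAR(100),
    metric_value DECIMAL(18,2),
    refreshed_at TIMESTAMP,
    PRIMARY KEY (report_year, metric_name)
);

-- Scheduled refresh: rerun the slow query for one year and swap it in.
DELETE FROM annual_report_cache WHERE report_year = 2020;
INSERT INTO annual_report_cache (report_year, metric_name, metric_value, refreshed_at)
SELECT 2020, 'total_revenue', SUM(f.sales_amount), CURRENT_TIMESTAMP
FROM   fact_sales f
JOIN   dim_date d ON d.date_key = f.date_key
WHERE  d.calendar_year = 2020;
```

User-facing reports then read from annual_report_cache in milliseconds; past years rarely change, so only the current year needs frequent refreshes.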
