business-intelligence

SSIS: Is it possible to have modularity and avoid replication inside a single Data Flow?

删除回忆录丶 submitted on 2019-12-24 19:33:49
Question: How can I do something like a programming-language function or method inside a Data Flow? My data flows repeat the same sequence of steps many times, and I want to avoid that. Using a programming language as an example, this is what I want:

    task1; task2; insert();
    ....
    task13; task14; insert();
    ....
    task60; task61; insert();

    // insert implementation
    insert() {
        logtask;
        insertInDatabaseAtask;
        insertInDatabaseBtask;
        audittask;
    }

I know that breaking my package flow into more data flows is a possibility, but in certain

Best ETL Packages In Python

孤者浪人 submitted on 2019-12-24 08:00:03
Question: I have 2 use cases: (1) Extract, Transform and Load from Oracle / PostgreSQL / Redshift / S3 / CSV to my own Redshift cluster; (2) schedule the job so it runs daily/weekly (INSERT + TABLE or INSERT + NONE options preferable). I am currently using: SQLAlchemy for extracts (works well generally); PETL for transforms and loads (works well on smaller data sets, but for ~50m+ rows it is slow and the connections to the database(s) time out); an internal tool for the scheduling component (which stores the
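One common way to keep a large transfer from holding a single long-running operation open is to move rows in fixed-size chunks. The following is a minimal sketch of that idea, not a recommendation of a specific package: it uses the standard-library `sqlite3` module as a stand-in for the real source and target databases, and the function name `copy_in_chunks`, the table names, and the chunk size are all hypothetical.

```python
import sqlite3


def copy_in_chunks(source, target, select_sql, insert_sql, chunk_size=10_000):
    """Stream rows from source to target, chunk_size rows at a time.

    Returns the number of rows copied. Both arguments are DB-API connections.
    """
    read_cursor = source.cursor()
    read_cursor.execute(select_sql)
    write_cursor = target.cursor()
    total = 0
    while True:
        rows = read_cursor.fetchmany(chunk_size)
        if not rows:
            break
        write_cursor.executemany(insert_sql, rows)
        target.commit()  # commit per chunk so no single transaction grows unbounded
        total += len(rows)
    return total


# Stand-in databases (assumption: the real ones would be Oracle, PostgreSQL, Redshift, ...).
source = sqlite3.connect(":memory:")
target = sqlite3.connect(":memory:")
source.execute("CREATE TABLE src (id INTEGER, name TEXT)")
source.executemany("INSERT INTO src VALUES (?, ?)", [(i, f"row{i}") for i in range(25)])
target.execute("CREATE TABLE dst (id INTEGER, name TEXT)")

copied = copy_in_chunks(
    source, target,
    "SELECT id, name FROM src",
    "INSERT INTO dst VALUES (?, ?)",
    chunk_size=10,
)
print(copied)  # 25
```

The placeholder style (`?`) is specific to `sqlite3`; other drivers use different parameter markers. For Redshift targets in particular, bulk loading from S3 with the COPY command is generally preferred over row-by-row inserts, so a sketch like this is mainly useful for the extract side.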

Dynamic sum in dax picking distinct values

非 Y 不嫁゛ submitted on 2019-12-24 00:59:07
Question: Below is sample data:

    Week  Practice  Type  capacity  Gen
    1     BI        c     80        0
    1     BI        c     80        1
    1     BI        sc    160       1
    1     BI        pc    240       0
    1     BI        pc    240       3
    1     BI        mc    1160      1
    1     BI        mc    1160      4
    1     BI        mc    1160      2
    1     BI        ac    440       1
    1     BI        d     40        0
    1     BI        d     40        3

I have a pivot chart that has 3 slicers, namely Practice, Type, and Gen. When I don't select any slicer, it should show a distinct sum(capacity), i.e., 2120. Then when I click on the Type slicer, say mc, sum(capacity) should be 1160, and when I click only on Gen, say 3, and clear the other filters, then sum(capacity) =
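To make the expected numbers concrete, here is a minimal Python sketch (not DAX) of the "distinct sum" described in the question. It assumes that capacity is constant within each (Week, Practice, Type) group and should be counted once per group, however many Gen rows the group has. The function name `distinct_capacity_sum` and its filter parameters are hypothetical, introduced only for illustration.

```python
# Sample rows from the question: (Week, Practice, Type, capacity, Gen)
ROWS = [
    (1, "BI", "c", 80, 0),
    (1, "BI", "c", 80, 1),
    (1, "BI", "sc", 160, 1),
    (1, "BI", "pc", 240, 0),
    (1, "BI", "pc", 240, 3),
    (1, "BI", "mc", 1160, 1),
    (1, "BI", "mc", 1160, 4),
    (1, "BI", "mc", 1160, 2),
    (1, "BI", "ac", 440, 1),
    (1, "BI", "d", 40, 0),
    (1, "BI", "d", 40, 3),
]


def distinct_capacity_sum(rows, practice=None, type_=None, gen=None):
    """Sum capacity once per (Week, Practice, Type) among rows that pass the filters.

    A filter left as None behaves like a slicer with nothing selected.
    """
    capacity_by_group = {}
    for week, row_practice, row_type, capacity, row_gen in rows:
        if practice is not None and row_practice != practice:
            continue
        if type_ is not None and row_type != type_:
            continue
        if gen is not None and row_gen != gen:
            continue
        # Later rows of the same group overwrite the same key, so each group counts once.
        capacity_by_group[(week, row_practice, row_type)] = capacity
    return sum(capacity_by_group.values())


print(distinct_capacity_sum(ROWS))              # 2120, no slicer selected
print(distinct_capacity_sum(ROWS, type_="mc"))  # 1160, Type slicer set to mc
```

In DAX, the same idea is usually expressed by iterating over the distinct Type values (for example with SUMX over VALUES) and taking one capacity per value, rather than summing the column directly; the exact measure depends on the model, which the excerpt does not show.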

Add delivery info to query in SAP Crystal Reports

余生长醉 submitted on 2019-12-24 00:46:12
Question: Below is a query linking Purchase Orders to Sales Orders. My understanding is that in order to include the delivery doc # in this report, I need to add one more table, ODLN (so there would be an additional field titled "Delivery Doc#", i.e. [ODLN.DocNum]). My problem is that I'm not sure how to join ODLN in the query below without messing anything up. ODLN.DocNum pretty much verifies that the PO did get placed at the time of the SO submission. SELECT DISTINCT o.CardName AS 'Customer Name' ,(isnull(c1

key not found error

谁说我不能喝 submitted on 2019-12-23 02:32:58
Question: I have a dimension called customer, with UnknownMember = True and UnknownMemberName = NA. When I process my dimension, I see all my customers plus an NA member. I also configured ErrorConfiguration = IgnoreError and KeyErrorLimit = 100. I have a row in my fact table with NULL in the customerID, and the cube fails to process with this error (it's changing NULL to 0; I'm not sure if that is expected): Errors in the OLAP storage engine: The attribute key cannot be found when processing: Table: 'dbo_FactSales',

Java Business Intelligence framework with ad-hoc web reporting? [closed]

老子叫甜甜 submitted on 2019-12-22 12:27:15
Question: I need a reporting framework that supports web views with ad-hoc reporting, as well as styled, canned PDF reports. My users will be non-power users, so I'll need to present something usable for the ad-hoc reporting. What's the best current solution in the Java world?
Answer 1: There's a jasper reports plugin for

How to fit a week into a calendar time hierarchy?

本小妞迷上赌 submitted on 2019-12-22 10:45:26
Question: As is usual with cubes, the users want things that don't fit into a hierarchy to be displayed hierarchically. They'd like to see Day > Week > Month > Quarter > Year as the hierarchy, but the problem with weeks is that a week can belong to 1 or 2 months, not just 1 month (and by extension to 2 quarters, semesters, or years). So my question is: how do I set up the attribute relationships, and how do I set up the hierarchy? Here is what I have, but I know it's not optimal. Hierarchies (cycle == weeks):
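The core difficulty, that a week does not roll up to exactly one month, can be checked with a short Python sketch. It assumes Monday-to-Sunday weeks, which the question does not state, and the function name `months_in_week` is hypothetical.

```python
from datetime import date, timedelta


def months_in_week(any_day):
    """Return the sorted (year, month) pairs touched by the Monday-to-Sunday week containing any_day."""
    monday = any_day - timedelta(days=any_day.weekday())
    days = (monday + timedelta(days=offset) for offset in range(7))
    return sorted({(day.year, day.month) for day in days})


# A week that sits entirely inside December 2019.
print(months_in_week(date(2019, 12, 18)))  # [(2019, 12)]

# The week of 30 Dec 2019 to 5 Jan 2020 spans two months (and two years).
print(months_in_week(date(2019, 12, 31)))  # [(2019, 12), (2020, 1)]
```

Because the second week maps to two parents, Week cannot sit between Day and Month in a strict one-to-many attribute relationship without an extra rule, such as assigning each week to a single month by convention.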

How to copy the contents of an FTP directory to a shared network path?

北战南征 submitted on 2019-12-22 09:32:58
Question: I need to copy the entire contents of a directory at an FTP location to a shared network location. The FTP Task has you specify the exact file name (not a directory), and the File System Task does not allow access to an FTP location. EDIT: I ended up writing a script task.
Answer 1: I've had some similar issues with the FTP task before. In my case, the file names changed based on the date and some other criteria. I ended up using a Script Task to perform the FTP operation. It looks like this is
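The scripted approach mentioned above can be illustrated outside SSIS. The sketch below uses Python's standard `ftplib` rather than an SSIS Script Task, so it shows the general technique and is not the poster's script. It assumes the remote directory is flat (files only, no subdirectories), and the names `copy_ftp_directory` and `copy_from_host` are hypothetical.

```python
import os
from ftplib import FTP


def copy_ftp_directory(ftp, remote_dir, local_dir):
    """Download every entry in remote_dir into local_dir and return the names copied.

    Assumption: remote_dir contains only plain files.
    """
    os.makedirs(local_dir, exist_ok=True)
    ftp.cwd(remote_dir)
    copied = []
    for name in ftp.nlst():
        base = os.path.basename(name)
        if base in ("", ".", ".."):
            continue  # some servers list the directory itself
        with open(os.path.join(local_dir, base), "wb") as handle:
            ftp.retrbinary("RETR " + name, handle.write)
        copied.append(base)
    return copied


def copy_from_host(host, user, password, remote_dir, local_dir):
    """Open a connection, copy the directory, and close the connection afterwards."""
    with FTP(host) as ftp:
        ftp.login(user, password)
        return copy_ftp_directory(ftp, remote_dir, local_dir)
```

The destination can be a UNC path such as a shared network folder, provided the account running the script has write access to it. The host, credentials, and paths passed to `copy_from_host` would all be supplied by the caller.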

Business Intelligence Development Studio 2008 installation

半世苍凉 submitted on 2019-12-22 02:01:29
Question: I'm feeling like a bit of a moron, but I can't find how to install Business Intelligence Development Studio 2008. I have 2005 currently, but need to upgrade for some features. I'm pretty sure that it's included with SQL Server 2008 Standard, but I guess I could be wrong. Is it included? And if so, where do I find the installer for it?
Answer 1: I think it's in the 'Shared Features' of the installation. http://msdn.microsoft.com/en-us/library/ms143786.aspx
Answer 2: yeah it's included with standard

Creating real time datawarehouse

ε祈祈猫儿з submitted on 2019-12-21 21:28:21
Question: I am doing a personal project that consists of creating the full architecture of a data warehouse (DWH). As the ETL and BI analysis tool I decided to use Pentaho; it has a lot of functionality, from easy dashboard creation to full data mining processes and OLAP cubes. I have read that a data warehouse must be a relational database, and I understand this. What I don't understand is how to achieve a near-real-time or fully real-time DWH. I have read about push and pull