etl

How to execute scheduled SQL script on Amazon Redshift?

Submitted by 社会主义新天地 on 2019-11-28 12:38:47
I have a series of ~10 queries to be executed every hour automatically in Redshift (and ideally report success/failure). Most queries are aggregations on my tables. I have tried using AWS Lambda with CloudWatch Events, but Lambda functions only survive for 5 minutes max and my queries can take up to 25 minutes.

systemjack: It's kind of strange that AWS doesn't provide a simple distributed cron-style service; it would be useful for so many things. There is SWF, but the timing/scheduling aspect is left up to the user. You could use Lambda/CloudWatch to trigger SWF events, but that's a lot of overhead to get…
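The answer trails off, but one concrete shape for the cron-style approach it hints at is a small console app run hourly by cron on an EC2 instance (or by Windows Task Scheduler). Redshift speaks the PostgreSQL wire protocol, so Npgsql can generally connect to it; the cluster endpoint, credentials, and queries below are hypothetical, so treat this as a minimal sketch rather than a production runner:

```csharp
using System;
using Npgsql; // Redshift is PostgreSQL-compatible on the wire

class HourlyQueryRunner
{
    static void Main()
    {
        // Hypothetical connection string and query list
        const string connString =
            "Host=my-cluster.abc123.us-east-1.redshift.amazonaws.com;" +
            "Port=5439;Database=analytics;Username=etl_user;Password=secret";
        string[] queries =
        {
            "INSERT INTO agg_hourly_visits SELECT DATE_TRUNC('hour', event_ts), COUNT(*) FROM events GROUP BY 1",
            "INSERT INTO agg_hourly_sales  SELECT DATE_TRUNC('hour', sold_at), SUM(total) FROM sales  GROUP BY 1"
        };

        using (var conn = new NpgsqlConnection(connString))
        {
            conn.Open();
            foreach (var sql in queries)
            {
                try
                {
                    using (var cmd = new NpgsqlCommand(sql, conn))
                    {
                        cmd.CommandTimeout = 30 * 60; // allow up to 30 minutes per query
                        cmd.ExecuteNonQuery();
                        Console.WriteLine("OK: " + sql);
                    }
                }
                catch (Exception ex)
                {
                    // Report failure however suits you (SNS, email, log file)
                    Console.Error.WriteLine("FAILED: " + sql + " -- " + ex.Message);
                }
            }
        }
    }
}
```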

How to Map Input and Output Columns dynamically in SSIS?

Submitted by 孤街醉人 on 2019-11-28 12:02:15
I have to upload data into SQL Server from .dbf files through SSIS. My output columns are fixed, but the input columns are not, because the files come from clients and a client may have arranged the data in their own style. There may be some unused columns too, or an input column name can differ from the output column name. One idea I had was to map each file's input columns to output columns in a SQL database table and use only those columns present for that file id, but I am not sure how to do that. Can you suggest how to do this, or do you have another idea? Table example: +--------+---
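Building on the mapping-table idea in the question: a Script Task (or a standalone C# step) could read the per-file mapping from SQL Server and pass only the columns that actually exist in the incoming file to SqlBulkCopy. A rough sketch, where the mapping table schema, target table name, and the already-loaded DataTable are all assumptions:

```csharp
using System.Collections.Generic;
using System.Data;
using System.Data.SqlClient;

static class DynamicLoader
{
    // dbfData: the .dbf contents already read into a DataTable
    // fileId:  identifies which client mapping applies
    public static void Load(DataTable dbfData, int fileId, string connString)
    {
        // 1. Read the input->output column map for this file id
        var mappings = new Dictionary<string, string>();
        using (var conn = new SqlConnection(connString))
        using (var cmd = new SqlCommand(
            "SELECT InputColumn, OutputColumn FROM dbo.ColumnMapping WHERE FileId = @id",
            conn))
        {
            cmd.Parameters.AddWithValue("@id", fileId);
            conn.Open();
            using (var rdr = cmd.ExecuteReader())
                while (rdr.Read())
                    mappings[rdr.GetString(0)] = rdr.GetString(1);
        }

        // 2. Bulk copy, mapping only the input columns present in this file
        using (var conn = new SqlConnection(connString))
        {
            conn.Open();
            using (var bulk = new SqlBulkCopy(conn) { DestinationTableName = "dbo.Target" })
            {
                foreach (DataColumn col in dbfData.Columns)
                    if (mappings.TryGetValue(col.ColumnName, out var dest))
                        bulk.ColumnMappings.Add(col.ColumnName, dest);
                bulk.WriteToServer(dbfData);
            }
        }
    }
}
```

Unused or unknown input columns simply get no mapping and are ignored, which matches the "use only those columns present for the file id" idea.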

Unzip .tar.gz files in SSIS

Submitted by 谁都会走 on 2019-11-28 11:34:44
Question: I have a .tar.gz file. Now I need to unpack these files with an SSIS package. Previously I did unzip-and-delete for .zip files with the help of a Foreach Loop Container and a Script Task. Not sure how to do it for .tar.gz files. Any help?

Answer 1: You can use an Execute Process Task to achieve this (or launch a process from a Script Task), but you have to install a ZIP application such as 7-Zip or WinZip, and use its command line to zip or unzip archives. Follow one of these links for more details: Zip a folder…
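In line with the answer, a Script Task can shell out to 7-Zip; note that a .tar.gz needs two passes (gunzip to get the .tar, then untar). A sketch, assuming 7-Zip is installed at its default path:

```csharp
using System.Diagnostics;
using System.IO;

static class TarGzExtractor
{
    const string SevenZip = @"C:\Program Files\7-Zip\7z.exe"; // assumed install path

    public static void Extract(string tarGzPath, string outputDir)
    {
        // Pass 1: decompress file.tar.gz -> file.tar (into outputDir)
        Run($"x \"{tarGzPath}\" -o\"{outputDir}\" -y");

        // Pass 2: unpack the resulting .tar, then delete the intermediate file
        string tarPath = Path.Combine(outputDir,
            Path.GetFileNameWithoutExtension(tarGzPath)); // strips the ".gz"
        Run($"x \"{tarPath}\" -o\"{outputDir}\" -y");
        File.Delete(tarPath);
    }

    static void Run(string arguments)
    {
        using (var p = Process.Start(new ProcessStartInfo
        {
            FileName = SevenZip,
            Arguments = arguments,
            UseShellExecute = false,
            CreateNoWindow = true
        }))
        {
            p.WaitForExit();
        }
    }
}
```

Wrapped in the same Foreach Loop Container used for the .zip files, this handles a folder of .tar.gz archives one at a time.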

Reverse engineering SSIS package using C#

Submitted by 我怕爱的太早我们不能终老 on 2019-11-28 11:21:21
There is a requirement to extract the source, the destination, and the column names of the source and destination. I am trying to do this because I have thousands of packages; each package has on average 60 to 75 columns, and opening each one to list all the required info takes a huge amount of time. It is not a one-time requirement either: this task is currently done manually every two months in my organization. I'm looking for a way to reverse engineer the packages by keeping them all in a single folder, going through each package, extracting the info, and putting it in a spreadsheet. I thought of opening…
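Since .dtsx files are plain XML, one approach that avoids opening packages in the designer is to scan them with LINQ to XML. Element and attribute names vary across SSIS versions, so this is a rough sketch to adapt against your own packages rather than a drop-in solution; the folder path is hypothetical:

```csharp
using System;
using System.IO;
using System.Linq;
using System.Xml.Linq;

class PackageScanner
{
    static void Main()
    {
        foreach (var file in Directory.GetFiles(@"C:\Packages", "*.dtsx")) // assumed folder
        {
            var doc = XDocument.Load(file);

            // Data-flow components (sources, destinations, transforms)
            var components = doc.Descendants()
                .Where(e => e.Name.LocalName == "component");

            foreach (var comp in components)
            {
                string compName = (string)comp.Attribute("name") ?? "?";

                // Input/output columns nested under the component;
                // attribute names differ by version, hence the fallback
                var columns = comp.Descendants()
                    .Where(e => e.Name.LocalName == "inputColumn"
                             || e.Name.LocalName == "outputColumn")
                    .Select(e => (string)e.Attribute("name")
                              ?? (string)e.Attribute("cachedName"))
                    .Where(n => !string.IsNullOrEmpty(n));

                // Emit CSV rows, ready to paste into a spreadsheet
                foreach (var col in columns)
                    Console.WriteLine($"{Path.GetFileName(file)},{compName},{col}");
            }
        }
    }
}
```

Redirecting the output to a .csv file gives the bi-monthly spreadsheet in one pass over the folder.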

Format excel destination column in ssis script task

Submitted by 浪尽此生 on 2019-11-28 10:46:23
Question: Is it possible to format a column in an Excel destination in SSIS before generating it? I'm thinking of a Script Task. I want to format a column as date/time within the Excel spreadsheet.

Answer 1: You can use the Microsoft.Office.Interop.Excel library and its NumberFormat property to change the EntireColumn format to datetime. Note: you have to add the Microsoft.Office.Interop.Excel.dll file to the following directories: (.NET Framework dll directory) C:\Windows\Microsoft.NET\Framework\v2.0.50727 and (sql…
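The NumberFormat call the answer describes might look like this in a Script Task; the file path, sheet index, target column, and format string are assumptions. Remember to call Quit so an Excel process doesn't linger after the package runs:

```csharp
using Excel = Microsoft.Office.Interop.Excel;

static class ExcelFormatter
{
    public static void FormatDateColumn(string path)
    {
        var app = new Excel.Application { Visible = false };
        try
        {
            Excel.Workbook wb = app.Workbooks.Open(path);
            Excel.Worksheet ws = (Excel.Worksheet)wb.Sheets[1];

            // Apply a date/time format to the whole of column C
            ws.Range["C1"].EntireColumn.NumberFormat = "yyyy-mm-dd hh:mm:ss";

            wb.Save();
            wb.Close();
        }
        finally
        {
            app.Quit();
        }
    }
}
```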

Automate process by running excel VBA macro in SSIS

Submitted by 一曲冷凌霜 on 2019-11-28 09:32:48
Recently I had a project that needed to automate a process by combining an SSIS package and Excel VBA macros into one flow. Below are the steps (a sketch of wiring them together follows):
1. An SSIS package exports every view's result from SQL Server into an individual Excel file; all files are saved in the same location.
2. One Excel VBA macro cleans each exported Excel file by removing all empty sheets.
3. Another Excel VBA macro merges all the Excel files into one master Excel file; this master file contains every result set, each saved on a different tab.
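One way to chain these steps inside the SSIS package itself is a Script Task that drives Excel through interop and invokes the existing macros via Application.Run. A sketch, assuming the macro lives in the opened workbook (the macro and workbook names are hypothetical):

```csharp
using Excel = Microsoft.Office.Interop.Excel;

static class MacroRunner
{
    public static void CleanAndSave(string workbookPath)
    {
        var app = new Excel.Application { Visible = false };
        try
        {
            Excel.Workbook wb = app.Workbooks.Open(workbookPath);

            // Invoke the cleaning macro defined in this workbook (hypothetical name)
            app.Run("RemoveEmptySheets");

            wb.Save();
            wb.Close();
        }
        finally
        {
            app.Quit();
        }
    }
}
```

The merge macro can be invoked the same way from a second Script Task, so the whole export-clean-merge flow gets a single SSIS entry point instead of three manual steps.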

How to join multiple azure databases without rights to configure external tables?

Submitted by 混江龙づ霸主 on 2019-11-28 06:58:09
Question: In my current setup I connect to an Azure SQL server using Authentication=Active Directory - Integrated. This method of access only allows access to a single database at a time. The architecture was migrated from an on-premises SQL Server environment, with changes to make cloud development feasible, but analytics and debugging must still occur across databases. Typically one would simply do a cross-database join with a legacy SQL Server configuration, possibly involving linked servers if the…
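When external tables can't be configured, one workaround for ad-hoc analytics and debugging is to join client-side: open one connection per database, pull the two result sets, and join them in memory. A minimal sketch, where the connection strings, tables, and key column are all hypothetical:

```csharp
using System;
using System.Data;
using System.Data.SqlClient;
using System.Linq;

class CrossDbJoin
{
    static DataTable Fetch(string connString, string sql)
    {
        var table = new DataTable();
        using (var conn = new SqlConnection(connString))
        using (var adapter = new SqlDataAdapter(sql, conn))
            adapter.Fill(table);
        return table;
    }

    static void Main()
    {
        // Same server, two databases, each reachable only one at a time
        var orders = Fetch(
            "Server=myserver.database.windows.net;Database=Sales;Authentication=Active Directory Integrated;",
            "SELECT OrderId, CustomerId, Total FROM dbo.Orders");
        var customers = Fetch(
            "Server=myserver.database.windows.net;Database=Crm;Authentication=Active Directory Integrated;",
            "SELECT CustomerId, Name FROM dbo.Customers");

        // In-memory inner join on CustomerId
        var joined =
            from o in orders.AsEnumerable()
            join c in customers.AsEnumerable()
                on o.Field<int>("CustomerId") equals c.Field<int>("CustomerId")
            select new { Name = c.Field<string>("Name"), Total = o.Field<decimal>("Total") };

        foreach (var row in joined)
            Console.WriteLine($"{row.Name}: {row.Total}");
    }
}
```

This obviously doesn't scale to large result sets the way a server-side join would, but for debugging across a handful of databases it needs no server configuration at all.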

SQL Server Destination vs OLE DB Destination

Submitted by 有些话、适合烂在心里 on 2019-11-28 05:43:32
Question: I was using an OLE DB Destination for bulk import of multiple flat files. After some tuning I found the SQL Server Destination to be 25-50% faster. However, I am confused about this destination, as there is contradictory information on the web: some sources advise against it, some suggest using it. I would like to know: are there any serious pitfalls before I deploy it to production? Thanks.

Answer 1: In this answer, I will try to provide information from the official documentation of SSIS, and I will…

Date calculation with parameter in SSIS is not giving the correct result

Submitted by 前提是你 on 2019-11-28 05:22:16
Question: I want to load data from the last n days from a data source. To do this, I have a project parameter "number_of_days". I use the parameter in an OLE DB data source with a SQL command containing the clause WHERE StartDate >= CAST(GETDATE() - ? AS date). The ? is mapped to the project parameter, an Int32. But if I ask for the last 10 days, it only gives me the last 8 days. Version info: SQL Server Data Tools 15.1.61710.120; the server is SQL Server 2017 Standard Edition. I set up a test…
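A common way to chase this kind of discrepancy is to compare the raw date arithmetic against DATEADD outside SSIS, since DATEADD(day, -?, GETDATE()) sidesteps the implicit conversions that datetime-minus-parameter arithmetic can trigger. A small test harness along those lines; the connection string is an assumption:

```csharp
using System;
using System.Data.SqlClient;

class DateParamTest
{
    static void Main()
    {
        const string connString = "Server=.;Database=tempdb;Integrated Security=true;";
        using (var conn = new SqlConnection(connString))
        {
            conn.Open();
            // Evaluate both expressions with the same parameter value
            using (var cmd = new SqlCommand(
                @"SELECT CAST(GETDATE() - @n AS date)           AS arithmetic,
                         CAST(DATEADD(day, -@n, GETDATE()) AS date) AS via_dateadd",
                conn))
            {
                cmd.Parameters.AddWithValue("@n", 10);
                using (var rdr = cmd.ExecuteReader())
                {
                    rdr.Read();
                    Console.WriteLine($"GETDATE() - @n : {rdr.GetDateTime(0):d}");
                    Console.WriteLine($"DATEADD        : {rdr.GetDateTime(1):d}");
                }
            }
        }
    }
}
```

If the two agree here but SSIS still drops days, the parameter mapping in the OLE DB source is the next thing to inspect.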

How to extract data from Google Analytics and build a data warehouse (webhouse) from it?

Submitted by 霸气de小男生 on 2019-11-28 04:24:42
I have clickstream data such as referring URL, top landing pages, and top exit pages, and metrics such as page views, number of visits, and bounces, all in Google Analytics. There is no database yet where all this information is stored. I am required to build a data warehouse from scratch (which I believe is known as a web-house) from this data. So I need to extract data from Google Analytics and load it into a warehouse on a daily, automated basis. My questions are: 1) Is it possible? Every day the data grows (some in terms of metrics or measures, such as visits, and some in terms of new referring…
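On the extraction side, Google's Analytics Reporting API (v4) can be polled once a day for the previous day's rows. The shape of a minimal pull might look like the sketch below; OAuth2 token acquisition is omitted, and the view ID, metric, and dimension names are assumptions to adapt to your property:

```csharp
using System;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Text;
using System.Threading.Tasks;

class GaExtractor
{
    static async Task Main()
    {
        // OAuth2 access token obtained elsewhere (e.g. a service-account flow)
        string accessToken = Environment.GetEnvironmentVariable("GA_TOKEN");

        // One day's sessions and pageviews per landing page (hypothetical view ID)
        const string body = @"{
          ""reportRequests"": [{
            ""viewId"": ""123456789"",
            ""dateRanges"": [{ ""startDate"": ""yesterday"", ""endDate"": ""yesterday"" }],
            ""metrics"": [{ ""expression"": ""ga:sessions"" }, { ""expression"": ""ga:pageviews"" }],
            ""dimensions"": [{ ""name"": ""ga:landingPagePath"" }]
          }]
        }";

        using (var http = new HttpClient())
        {
            http.DefaultRequestHeaders.Authorization =
                new AuthenticationHeaderValue("Bearer", accessToken);
            var response = await http.PostAsync(
                "https://analyticsreporting.googleapis.com/v4/reports:batchGet",
                new StringContent(body, Encoding.UTF8, "application/json"));

            string json = await response.Content.ReadAsStringAsync();
            Console.WriteLine(json); // parse and load into staging tables from here
        }
    }
}
```

Scheduled daily (cron, Task Scheduler, or an SSIS job), the parsed rows land in staging tables that the warehouse's dimension and fact loads can build on, which answers the "is it possible" part: yes, as an incremental daily append.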