azure-data-factory

What is the easiest way to pull data from a blob and load it into a table in SQL Server?

半城伤御伤魂 submitted on 2021-02-19 09:07:37
Question: I have hundreds of zipped files sitting in different folders, which I can access using MS Storage Explorer. I just set up a SQL Server DB in Azure. Now I am trying to figure out how to pull the data from each file in each folder, unzip it, parse it, and load it into tables. The data arrives daily, so the folders are named '1', '2', '3', up to '31', for the days of the month. There are also monthly folders '1' through '12', for the 12 months of the year. Finally, I have folders named '2017',
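A single copy pipeline can handle the unzip step through the dataset's compression setting. Below is a minimal sketch of a source dataset in the older ADF v2 blob dataset model, assuming a dataset named `ZippedDailyFiles` and a linked service named `AzureBlobLS` (both names are hypothetical, and the year/month/day path would normally be parameterized); `ZipDeflate` tells the copy activity to decompress each file before it reaches the SQL sink:

```json
{
  "name": "ZippedDailyFiles",
  "properties": {
    "type": "AzureBlob",
    "linkedServiceName": {
      "referenceName": "AzureBlobLS",
      "type": "LinkedServiceReference"
    },
    "typeProperties": {
      "folderPath": "mycontainer/2017/1/1",
      "format": { "type": "TextFormat" },
      "compression": { "type": "ZipDeflate" }
    }
  }
}
```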

Copy and Extracting Zipped XML files from HTTP Link Source to Azure Blob Storage using Azure Data Factory

徘徊边缘 submitted on 2021-02-19 08:48:05
Question: I am trying to set up an Azure Data Factory copy data pipeline. The source is an open HTTP linked service (URL: https://clinicaltrials.gov/AllPublicXML.zip), i.e. the source is a zipped folder containing many XML files. I want to unzip the archive and save the extracted XML files to Azure Blob Storage using Azure Data Factory. I was following the configuration described in "How to decompress a zip file in Azure Data Factory v2", but I am getting the following error:
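For an HTTP source like this, a binary copy with `ZipDeflate` on the source dataset is the usual route: the copy activity downloads the archive, decompresses it, and writes each member file to the blob sink. A sketch of the source dataset, assuming a linked service named `HttpServerLS` pointing at clinicaltrials.gov (the names are assumptions):

```json
{
  "name": "HttpZipSource",
  "properties": {
    "type": "Binary",
    "linkedServiceName": {
      "referenceName": "HttpServerLS",
      "type": "LinkedServiceReference"
    },
    "typeProperties": {
      "location": {
        "type": "HttpServerLocation",
        "relativeUrl": "AllPublicXML.zip"
      },
      "compression": { "type": "ZipDeflate" }
    }
  }
}
```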

Azure Data Factory Copy activity/Data flow consumes all RUs in CosmosDB

心已入冬 submitted on 2021-02-19 05:49:12
Question: We use Azure Data Factory for ETL to push materialized views to our Cosmos DB instance, making sure our production Azure Cosmos DB (SQL API) contains everything our users need. The Cosmos DB is under constant load, as data also flows in via the speed layer. This is expected and is currently handled with an autoscaling RU setting. We have tried these options: an ADF pipeline with a Copy activity to upsert data from Azure Data Lake (Gen2) (source) to the collection in Cosmos DB (sink). An ADF
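When a Copy activity saturates RUs, the usual levers sit on the sink side of the activity definition: a smaller write batch and a cap on parallel connections make the upsert compete less aggressively with the speed layer. A sketch of the relevant sink fragment, with illustrative values rather than recommendations:

```json
"sink": {
  "type": "CosmosDbSqlApiSink",
  "writeBehavior": "upsert",
  "writeBatchSize": 1000,
  "maxConcurrentConnections": 4
}
```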

how to fill down values using Azure Data Factory

半世苍凉 submitted on 2021-02-11 15:15:34
Question: Sorry for the basic question; I come from a Power Query background and have started using ADF for a new project. I started with wrangling data flows, where fill-down is not supported. Now I am trying a mapping data flow, and I can't find in the documentation how to fill down a value. For example, I have the ID column and want to add FILL_ID. Answer 1: This data flow script snippet will do the trick: source1 derive(dummy = 1) ~> DerivedColumn1 DerivedColumn1 window(over(dummy), asc(movie,
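The truncated snippet above follows the standard fill-down pattern for mapping data flows: derive a constant `dummy` column so the entire stream falls into one window partition, then carry the previous non-null value forward with `last()` in ignore-nulls mode. A sketch completing that shape, assuming `ID` is the sparse column, `FILL_ID` the filled output, and `movie` the sort key (column names taken from the question, the rest is an assumption):

```
source1 derive(dummy = 1) ~> DerivedColumn1
DerivedColumn1 window(over(dummy),
    asc(movie, true()),
    FILL_ID = coalesce(ID, last(ID, true()))) ~> Window1
```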

How to filter timestamp column in Data Flow of Azure Data Factory

China☆狼群 submitted on 2021-02-11 14:51:23
Question: I have a timestamp column, and I wrote the following expression to filter on it: contact_date >= toTimestamp('2020-01-01') && contact_date <= toTimestamp('2020-12-31') It doesn't complain about the syntax, but after the run it doesn't filter on the dates specified. Simply put, the logic doesn't work. Any idea? Date column in dataset: Answer 1: Please don't use the toTimestamp() function. I tested it and you will get null output. I used a Filter activity to filter the data. Please use toString() and change
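A likely cause is that toTimestamp() without a format string returns null when the literal does not match the expected default format, and comparisons against null drop every row. Two candidate fixes in data flow expression syntax (the column name is from the question; treat the exact format strings as assumptions to verify):

```
/* pass an explicit format to toTimestamp() */
contact_date >= toTimestamp('2020-01-01', 'yyyy-MM-dd')
  && contact_date <= toTimestamp('2020-12-31', 'yyyy-MM-dd')

/* or compare as strings, as the answer suggests */
toString(contact_date, 'yyyy-MM-dd') >= '2020-01-01'
  && toString(contact_date, 'yyyy-MM-dd') <= '2020-12-31'
```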

Azure Data Factory : Set a limit to copy number of files using Copy activity

落花浮王杯 submitted on 2021-02-11 14:01:10
Question: I have a Copy activity in my pipeline that copies files from Azure Data Lake Gen2. The source location may have thousands of files that need to be copied, but we need to set a limit on the number of files copied. Is there any option available in ADF to achieve this, barring a custom activity? E.g.: I have 2000 files in the Data Lake, but while running the pipeline I should be able to pass a parameter to copy only 500 files. Regards, Sandeep. Answer 1: I think you can
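One pattern (a sketch, not a complete pipeline): list the files with a Get Metadata activity (with the childItems field selected), then feed only the first N into a ForEach activity that runs a per-file copy. The activity name `GetFileList` and the `fileLimit` pipeline parameter below are assumptions; `take()` is the pipeline expression function that truncates a collection:

```
@take(activity('GetFileList').output.childItems, int(pipeline().parameters.fileLimit))
```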

How to call Oracle stored procedure from azure data factory v2

谁说我不能喝 submitted on 2021-02-11 12:30:01
Question: My requirement is to copy data from Oracle to SQL Server. Before copying from the Oracle database, I need to update the Oracle table using a procedure that contains some logic. How do I execute an Oracle stored procedure from Azure Data Factory? I referred to this thread; if I use EXECUTE PROC_NAME (PARAM); in the pre-copy script, it fails with the following error: Failure happened on 'Source' side. ErrorCode=UserErrorOdbcOperationFailed, Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException Message=ERROR
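EXECUTE is SQL*Plus client shorthand rather than SQL, which is why the ODBC-based Oracle connector rejects it. Wrapping the call in a PL/SQL anonymous block is the usual workaround (the procedure and parameter names below are the placeholders from the question):

```sql
BEGIN
  PROC_NAME(PARAM);
END;
```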

Azure Data Factory - MS Access as Source Database - Error

有些话、适合烂在心里 submitted on 2021-02-10 20:31:03
Question: My source is an Access database, and I dynamically generate the source query as 'Select * from <tableName>'. But I have field names with spaces in the source table, and the destination is of type .parquet, so the Data Factory pipeline fails with the error below. For example, table Employee has a column 'First Name'. { "errorCode": "2200", "message": "Failure happened on 'Sink' side. ErrorCode=UserErrorJavaInvocationException,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=An error occurred
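Parquet column names cannot contain spaces, so one workaround is to alias the offending columns in the generated source query instead of using SELECT * (the table and column names below are from the question's example; the alias is an illustrative choice):

```sql
SELECT [First Name] AS First_Name
FROM Employee
```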