azure-data-factory-2

Transform data in Azure Data Factory using Python Databricks

五迷三道 submitted on 2021-02-11 12:31:29
Question: I have the task of transforming and consolidating millions of single JSON files into big CSV files. The operation would be very simple using a copy activity and mapping the schemas, which I have already tested. The problem is that a massive number of the files have badly formed JSON. I know what the error is, and the fix is very simple too. I figured I could use a Python Databricks activity to fix the string and then pass the output to a copy activity that consolidates the records into a big CSV file.
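As a sketch of what the Databricks repair step could do, assuming for illustration that the formatting defect is a trailing comma before a closing brace or bracket (substitute your actual fix in `repair_json`):

```python
import csv
import io
import json
import re

def repair_json(text: str) -> dict:
    """Remove trailing commas before '}' or ']' (an assumed defect), then parse."""
    cleaned = re.sub(r",\s*([}\]])", r"\1", text)
    return json.loads(cleaned)

def records_to_csv(records: list[dict]) -> str:
    """Consolidate parsed records into a single CSV string."""
    buf = io.StringIO()
    fieldnames = sorted({key for record in records for key in record})
    writer = csv.DictWriter(buf, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(records)
    return buf.getvalue()
```

In a real notebook the same two functions would run inside the loop (or Spark job) that reads each source file and appends to the consolidated output.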

Azure Data Factory Pipeline Consumption Details

社会主义新天地 submitted on 2021-02-11 12:23:42
Question: I have an Azure Data Factory provisioned, and it is shared by different departments for their pieces of work. The orchestration framework is common, but individual pipelines are specific to each department's needs. Now it is getting hard to split the bill between agencies. How can I get the consumption (DIU) details from ADF per pipeline to split the bill? Or is there a better way to do this? Answer 1: Like you said, it's very hard to split the bill. For now, there isn't a better way to do this. The
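One way to approximate a per-pipeline split is to query pipeline runs through the ADF REST API ("Pipeline Runs - Query By Factory") and then read the billable consumption from each run's details. A minimal sketch that builds the request; the subscription, resource group, and factory names below are placeholders:

```python
import json
from datetime import datetime, timedelta, timezone

# Hypothetical identifiers -- replace with your own values.
SUBSCRIPTION = "00000000-0000-0000-0000-000000000000"
RESOURCE_GROUP = "my-rg"
FACTORY = "my-adf"

def query_runs_request(days_back: int = 30) -> tuple[str, str]:
    """Build the URL and JSON body for the Pipeline Runs - Query By Factory call.

    The response lists runs with their pipeline names, which can then be
    grouped by pipeline to apportion the bill."""
    url = (
        f"https://management.azure.com/subscriptions/{SUBSCRIPTION}"
        f"/resourceGroups/{RESOURCE_GROUP}"
        f"/providers/Microsoft.DataFactory/factories/{FACTORY}"
        "/queryPipelineRuns?api-version=2018-06-01"
    )
    now = datetime.now(timezone.utc)
    body = json.dumps({
        "lastUpdatedAfter": (now - timedelta(days=days_back)).isoformat(),
        "lastUpdatedBefore": now.isoformat(),
    })
    return url, body
```

The request would be POSTed with an Azure AD bearer token; how consumption maps to cost still depends on your pricing tier, so treat this as a starting point rather than an exact chargeback mechanism.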

Azure Data Factory - MS Access as Source Database - Error

有些话、适合烂在心里 submitted on 2021-02-10 20:31:03
Question: My source is an Access database, and I dynamically generate the source query as 'Select * from <tableName>'. But the source table has field names with spaces, the destination is of type .parquet, and the Data Factory pipeline is failing with the error below. For example, table Employee has a column 'First Name': { "errorCode": "2200", "message": "Failure happened on 'Sink' side. ErrorCode=UserErrorJavaInvocationException,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=An error occurred
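A possible workaround is to generate the source query with parquet-safe column aliases instead of a plain SELECT *. A sketch of the query builder; the space-to-underscore rule is an assumption, so adjust it to your own naming convention:

```python
def access_select_with_aliases(table: str, columns: list[str]) -> str:
    """Build a SELECT that brackets Access column names and aliases any
    name containing spaces to a parquet-safe identifier."""
    parts = []
    for col in columns:
        safe = col.replace(" ", "_")
        # Alias only when the name actually changes.
        parts.append(f"[{col}] AS {safe}" if safe != col else f"[{col}]")
    return f"SELECT {', '.join(parts)} FROM [{table}]"
```

The column list itself could come from a Lookup activity or a schema table, and the generated string would replace the dynamic 'Select * from <tableName>' expression.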

Log Azure Data Factory Pipeline run events to Azure App Insights

牧云@^-^@ submitted on 2021-02-10 17:10:52
Question: Is there a way to publish ADF pipeline run events with status to App Insights? Answer 1: To my knowledge, you can use a Web activity in ADF to invoke the Application Insights REST API after your main activities have executed (or use an Execute Pipeline activity to execute your root pipeline and get its status or output), then send the result to the App Insights REST API. For more details, please refer to this article: https://www.ben-morris.com/using-azure-data-factory-with-the-application-insights-rest-api
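For the Web activity body, a sketch of the custom-event envelope the App Insights track endpoint (https://dc.services.visualstudio.com/v2/track) accepts. The field names follow the public telemetry envelope schema, but verify them against current documentation; the event name "ADFPipelineRun" is just an example:

```python
import json
from datetime import datetime, timezone

def app_insights_event(ikey: str, pipeline: str, status: str) -> str:
    """Serialize a custom-event envelope for the App Insights track endpoint.

    `ikey` is the instrumentation key of the target App Insights resource;
    pipeline name and run status travel as custom properties."""
    envelope = {
        "name": "Microsoft.ApplicationInsights.Event",
        "time": datetime.now(timezone.utc).isoformat(),
        "iKey": ikey,
        "data": {
            "baseType": "EventData",
            "baseData": {
                "ver": 2,
                "name": "ADFPipelineRun",
                "properties": {"pipeline": pipeline, "status": status},
            },
        },
    }
    return json.dumps(envelope)
```

In the Web activity the equivalent JSON would be built with ADF expressions such as @pipeline().Pipeline and the output of the preceding activity.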

Data Factory Childitem modified or created date

穿精又带淫゛_ submitted on 2021-02-10 14:33:24
Question: I have a Data Factory V2 pipeline consisting of Get Metadata and ForEach activities that reads a list of files on an on-premises file share and logs it to a database table. Currently I am only able to read the file name, but I would also like to retrieve the date-modified and/or date-created property of each file. Any help, please? Thank you. Answer 1: According to the MS documentation, both the File System and SFTP connectors support the lastModified property. But we can only get the lastModified of one
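If the share is reachable from a machine that can run a script, the same name-plus-lastModified pairs the pipeline would log can also be collected outside ADF and handed to it. A sketch using only the standard library:

```python
import os
from datetime import datetime, timezone

def file_dates(folder: str) -> list[tuple[str, str]]:
    """Return (file name, last-modified UTC ISO timestamp) for each file
    in a folder -- the pairs the pipeline wants to log per file."""
    rows = []
    for name in sorted(os.listdir(folder)):
        path = os.path.join(folder, name)
        if os.path.isfile(path):
            mtime = datetime.fromtimestamp(os.path.getmtime(path), timezone.utc)
            rows.append((name, mtime.isoformat()))
    return rows
```

Inside ADF itself, the alternative the answer hints at is one Get Metadata call per file within the ForEach loop, requesting the lastModified field; that works but costs an activity run per file.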

How to decompress a zip file in Azure Data Factory v2

一世执手 submitted on 2021-02-05 11:40:59
Question: I'm trying to decompress a zip file (with multiple files inside) using Azure Data Factory v2. The zip file is located in Azure File Storage. The ADF copy task just copies the original zip file without decompressing it. Any suggestion on how to make this work? This is the current configuration: the zip file source was set up as a binary dataset with Compression Type = ZipDeflate; the target folder was also set up as a binary dataset, but with Compression Type = None. A pipeline with a single Copy
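As a fallback when the Copy activity configuration does not unpack the archive, the extraction can be done in a small custom step instead (for example an Azure Function or a Databricks notebook); a sketch using Python's standard zipfile module:

```python
import zipfile

def unzip_all(zip_path: str, target_dir: str) -> list[str]:
    """Extract every member of the archive into target_dir and
    return the list of member names, for logging or validation."""
    with zipfile.ZipFile(zip_path) as zf:
        zf.extractall(target_dir)
        return zf.namelist()
```

The custom step would download the zip from Azure File Storage to local scratch space, extract it, and upload the members back; the subsequent Copy activity then works on the already-decompressed files.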

Error calling the azure function endpoint from azure data factory

眉间皱痕 submitted on 2021-02-05 09:30:07
Question: I have linked an Azure Function in a Data Factory pipeline that writes a text file to blob storage. The Azure Function works fine when executed independently and writes the file to blob storage, but I get the error below when I run the Azure Function from Data Factory: { "errorCode": "3600", "message": "Error calling the endpoint.", "failureType": "UserError", "target": "Azure Function1" } I have configured the azure function to access the blob with the blob endpoint and shared access
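One commonly reported cause of error 3600 is the function returning a body that is not a valid JSON object, which the Azure Function activity expects; whether that applies here is an assumption, but a sketch of a response body an HTTP-triggered function could return:

```python
import json

def function_response(message: str) -> tuple[str, dict]:
    """Build a body and headers for an HTTP-triggered function reply.

    The Azure Function activity in ADF expects a JSON object body, so the
    payload is serialized with json.dumps and tagged application/json
    rather than returned as plain text."""
    body = json.dumps({"status": "ok", "message": message})
    headers = {"Content-Type": "application/json"}
    return body, headers
```

In the function itself this pair would be passed to the HTTP response object (for example `func.HttpResponse(body, headers=headers)` in the Python worker); checking the function's own invocation logs usually narrows down whether the failure is the response shape or the blob access.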

Azure datafactory v2 Execute Pipeline with For Each

末鹿安然 submitted on 2021-01-29 19:57:48
Question: I am trying to use an "Execute Pipeline" activity to invoke a pipeline which has a ForEach activity, and I get an error. JSON for the Execute Pipeline activity:

[
  {
    "name": "pipeline3",
    "properties": {
      "activities": [
        {
          "name": "Test_invoke1",
          "type": "ExecutePipeline",
          "dependsOn": [],
          "userProperties": [],
          "typeProperties": {
            "pipeline": {
              "referenceName": "MAIN_SA_copy1",
              "type": "PipelineReference"
            },
            "waitOnCompletion": true
          }
        }
      ],
      "annotations": []
    }
  }
]

JSON for the invoked pipeline with the ForEach activity: [ { "name": "MAIN_SA_copy1"