Azure Data Factory Get Metadata to get blob filenames and transfer them to Azure SQL database table

Submitted by 心不动则不痛 on 2020-12-15 08:34:51

Question


I am trying to use the Get Metadata activity in Azure Data Factory to get blob filenames and copy them to an Azure SQL database table. I am following this tutorial: https://www.mssqltips.com/sqlservertip/6246/azure-data-factory-get-metadata-example/

Here is my pipeline. Copy Data > Source is the location of the blob files in my Blob storage. I need to specify the source dataset as binary because the files are *.jpeg files.

My Copy Data > Sink is the Azure SQL database, and I enabled the option "Auto Create table".

In my sink dataset configuration, I had to choose a table, because validation does not pass unless a table in my SQL database is selected, even though this table is not related at all to the blob filenames that I want to get.

Question 1: Am I supposed to create a new table in the SQL database beforehand, with columns matching the blob filenames that I want to extract?

Then I tried to validate the pipeline and got this error:

Copy_Data_1
Sink must be binary when source is binary dataset.

Question 2: How can I resolve this error? I had to select binary as the file type of the source because choosing a type is one of the steps when creating the source dataset. When I chose the sink dataset, which is an Azure SQL table, I was not asked to select a dataset type, so the two do not seem to match.

Thank you very much in advance.

Here is the new pipeline. I can now get the itemName of each filename in the JSON output files.

Now I have added a Copy Data activity right after the Get_File_Name2 activity and connected the two, to try to use the JSON output files as the source dataset.

However, I need to choose the source dataset location before I can specify its type as JSON. As far as I understand, these JSON output files are the output of the Get_File_Name2 activity and are not yet stored in Blob storage. How do I make the Copy Data activity read these JSON output files as its source dataset?

Update 10/14/2020: Here is my new Stored Procedure activity. I added the parameter as suggested; however, I changed its name to JsonData because my stored procedure requires a parameter with that name.

This is my stored procedure.

I get this error at the stored procedure:

{
    "errorCode": "2402",
    "message": "Execution fail against sql server. Sql error number: 13609. Error Message: JSON text is not properly formatted. Unexpected character 'S' is found at position 0.",
    "failureType": "UserError",
    "target": "Stored procedure1",
    "details": []
}

But when I check the activity input, it seems to have read the JSON string itemName successfully.

But when I check the output, it is not there.


Answer 1:


Actually, you can pass the Get Metadata output JSON as a parameter and call the stored procedure directly: Get Metadata --> Stored Procedure.

You just need to focus on writing the stored procedure.

Get Metadata output childItems:

{
    "childItems": [
        {
            "name": "DeploymentFiles.zip",
            "type": "File"
        },
        {
            "name": "geodatalake.pdf",
            "type": "File"
        },
        {
            "name": "test2.xlsx",
            "type": "File"
        },
        {
            "name": "word.csv",
            "type": "File"
        }
    ]
}

Stored Procedure:

@activity('Get Metadata1').output.childitems
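
Regarding error 13609 from the question ("Unexpected character 'S' is found at position 0"): the message indicates that the text reaching the procedure did not begin with `[` or `{`, so it was not JSON text at all. One possible cause, which is an assumption and not confirmed in this thread, is that the array was passed without being serialized to a string; if so, wrapping the expression as @string(activity('Get Metadata1').output.childItems) may help. The Python sketch below only illustrates the failure mode, namely that a JSON parser rejects such a value at position 0. The `not_json` value is a hypothetical placeholder, not the actual text Data Factory sent.

```python
import json
from typing import Optional


def first_json_error_position(text: str) -> Optional[int]:
    """Return the character offset of the first JSON syntax error, or None if text is valid JSON."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        return err.pos
    return None


# Serialized JSON text: the form a JSON-parsing stored procedure expects to receive.
serialized = json.dumps([{"name": "test2.xlsx", "type": "File"}])

# ASSUMPTION: a made-up, non-JSON rendering of an array that happens to start with 'S'.
# The real value sent by Data Factory is not shown in the thread.
not_json = "System.Object[]"

ok_position = first_json_error_position(serialized)
bad_position = first_json_error_position(not_json)
```

Checking the Stored Procedure activity's input in the monitoring view for a value that starts with `[` is a quick way to tell the two cases apart.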

For how to write the stored procedure (extracting data from a JSON object), you can refer to this blog post: Retrieve JSON Data from SQL Server using a Stored Procedure.
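
As a rough illustration of what such a procedure has to do, the Python sketch below parses the childItems array and inserts one row per file name. It is not T-SQL: an in-memory SQLite database stands in for Azure SQL, and the table name `blob_files`, the column `file_name`, and the function `insert_file_names` are all hypothetical. In Azure SQL, the parsing step would typically be done with OPENJSON inside the procedure.

```python
import json
import sqlite3


def insert_file_names(conn: sqlite3.Connection, json_data: str) -> int:
    """Parse a childItems JSON array and insert each file's name as one row."""
    items = json.loads(json_data)
    # Keep only entries of type "File"; folders would appear with type "Folder".
    names = [(item["name"],) for item in items if item.get("type") == "File"]
    conn.executemany("INSERT INTO blob_files (file_name) VALUES (?)", names)
    conn.commit()
    return len(names)


# SQLite is only a stand-in for the Azure SQL table (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE blob_files (file_name TEXT NOT NULL)")

# The childItems array from the Get Metadata output shown above, as JSON text.
json_data = json.dumps([
    {"name": "DeploymentFiles.zip", "type": "File"},
    {"name": "geodatalake.pdf", "type": "File"},
    {"name": "test2.xlsx", "type": "File"},
    {"name": "word.csv", "type": "File"},
])

inserted = insert_file_names(conn, json_data)
rows = [r[0] for r in conn.execute("SELECT file_name FROM blob_files ORDER BY rowid")]
```

In this sketch the target table needs only a single column for the file name, so it can be created once up front rather than having one column per file.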



Source: https://stackoverflow.com/questions/64227251/azure-data-factory-get-metadata-to-get-blob-filenames-and-transfer-them-to-azure
