How to decompress a zip file in Azure Data Factory v2

Submitted by 一世执手 on 2021-02-05 11:40:59

Question


I'm trying to decompress a zip file (with multiple files inside) using Azure Data Factory v2. The zip file is located in Azure File Storage. The ADF Copy task just copies the original zip file without decompressing it. Any suggestion on how to make this work?

This is the current configuration:

  1. The zip file source was set up as a binary dataset with Compression Type = ZipDeflate.
  2. The target folder was also set up as a binary dataset, but with Compression Type = None.
  3. A pipeline with a single Copy task was created to move the files from the zip file to the target folder.

Answer 1:


To do this, set the compression type to "ZipDeflate" in the source dataset. The sink dataset of the Copy activity does not need any compression configuration (its compression type is "none").

In the Copy activity sink settings, please set the copy behavior to "Flatten Hierarchy" to unzip and write the individual files.

When the copy behavior is set to "Flatten Hierarchy", all files in the zipped source file are extracted and written as individual files to the destination folder specified in the sink dataset, and they are renamed to data_SomeGUID.csv.

If you do not specify a copy behavior in the Copy activity (it is left as "none"), it still decompresses the ZipDeflate file(s). When it writes to a file-based sink data store, the files are extracted to the folder: <path specified in dataset>/<folder named as source zip file>/.
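The settings described above can be sketched as pipeline JSON. The following Python snippet builds that JSON as plain dictionaries so the relevant fields are easy to see. It is a minimal sketch under assumptions, not an exported pipeline: the names `ZipSource`, `UnzippedSink`, `MyFileStorage`, `incoming`, `archive.zip`, and `extracted` are hypothetical, the sink is assumed to be Azure File Storage like the source, and the property names (`AzureFileStorageLocation`, `BinarySource`, `copyBehavior`, and so on) reflect my understanding of the ADF JSON schema and should be checked against the JSON view of your own factory.

```python
import json


def binary_dataset(name, linked_service, folder_path, file_name=None, compression=None):
    """Build a Binary dataset definition (assumed Azure File Storage location)."""
    location = {"type": "AzureFileStorageLocation", "folderPath": folder_path}
    if file_name:
        location["fileName"] = file_name
    type_properties = {"location": location}
    if compression:
        # Only the zipped source needs this; the sink omits it (compression "none").
        type_properties["compression"] = {"type": compression}
    return {
        "name": name,
        "properties": {
            "type": "Binary",
            "linkedServiceName": {
                "referenceName": linked_service,
                "type": "LinkedServiceReference",
            },
            "typeProperties": type_properties,
        },
    }


def copy_activity(source_dataset, sink_dataset, copy_behavior=None):
    """Build a Copy activity; copy_behavior=None leaves the setting unspecified."""
    sink_store = {"type": "AzureFileStorageWriteSettings"}
    if copy_behavior:
        sink_store["copyBehavior"] = copy_behavior
    return {
        "name": "UnzipCopy",  # hypothetical activity name
        "type": "Copy",
        "inputs": [{"referenceName": source_dataset, "type": "DatasetReference"}],
        "outputs": [{"referenceName": sink_dataset, "type": "DatasetReference"}],
        "typeProperties": {
            "source": {
                "type": "BinarySource",
                "storeSettings": {"type": "AzureFileStorageReadSettings", "recursive": True},
            },
            "sink": {"type": "BinarySink", "storeSettings": sink_store},
        },
    }


if __name__ == "__main__":
    source = binary_dataset("ZipSource", "MyFileStorage", "incoming", "archive.zip", "ZipDeflate")
    sink = binary_dataset("UnzippedSink", "MyFileStorage", "extracted")
    activity = copy_activity("ZipSource", "UnzippedSink", copy_behavior="FlattenHierarchy")
    print(json.dumps({"datasets": [source, sink], "activity": activity}, indent=2))
```

Calling `copy_activity("ZipSource", "UnzippedSink")` without a copy behavior produces the variant in which the setting is left unspecified.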

For details on compression support in Azure Data Factory, please refer to this doc: https://docs.microsoft.com/azure/data-factory/supported-file-formats-and-compression-codecs-legacy#compression-support




Answer 2:


If you don't want to lose the names of the files within your zip, use the Copy activity but set the Copy Behavior to "Preserve hierarchy". This will create a folder with the name of your zip file, and the files will be inside with their original names.
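To make the resulting layout concrete, here is a small local illustration using Python's standard `zipfile` module. It does not call Azure Data Factory; it only computes the paths that a "folder named after the zip, original file names inside" layout would yield. The function name `preserved_layout` and the sample names are hypothetical. Whether ADF keeps the `.zip` extension in the folder name is not stated in this answer, so the sketch simply uses the full file name as an assumption.

```python
import io
import zipfile
from pathlib import PurePosixPath
from typing import List


def preserved_layout(zip_name: str, zip_bytes: bytes, sink_folder: str) -> List[str]:
    """List the paths produced when each entry keeps its name under a folder named after the zip."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        # Skip pure directory entries; only files end up as outputs.
        names = [info.filename for info in archive.infolist() if not info.is_dir()]
    # Assumption: the folder name is the zip file name exactly as given.
    return [str(PurePosixPath(sink_folder) / zip_name / name) for name in names]


if __name__ == "__main__":
    # Build a tiny zip in memory so the example needs no files on disk.
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("a.csv", "id\n1\n")
        archive.writestr("sub/b.csv", "id\n2\n")
    for path in preserved_layout("archive.zip", buffer.getvalue(), "extracted"):
        print(path)
```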




Source: https://stackoverflow.com/questions/57261025/how-to-decompress-a-zip-file-in-azure-data-factory-v2
