Apache NiFi decompression

Posted by 你离开我真会死。 on 2021-01-27 21:18:29

Question


I'm new to Apache NiFi and am trying to build a flow as a proof of concept (POC). I'd appreciate some guidance.

I have a gzip-compressed file, say 'sample.gz', containing a file, say 'sample_file'. I need to decompress sample.gz and store 'sample_file' in an HDFS location.

I'm using a GetFile processor to pick up sample.gz, a CompressContent processor in decompress mode to decompress it, and a PutHDFS processor to write the decompressed file to the HDFS location.

After running the flow, I find that only the original sample.gz file is copied to the HDFS location, whereas I need the sample_file inside the gz file to be copied. So decompression has not actually worked for me.

I hope I have explained the issue clearly. Please suggest whether I need to change my approach.


Answer 1:


I used the same sequence of processors, but replaced PutHDFS with PutFile:

GetFile --> CompressContent(decompress) --> PutFile

In NiFi v1.3.0 this works fine.

One note: if I leave the CompressContent property Update Filename = false, the filename attribute stays the same after decompression as it was before (sample.gz).

But the content is decompressed.
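One way to confirm this outside NiFi is to inspect the first two bytes of the output file: gzip data always starts with the magic bytes 0x1f 0x8b. The sketch below is only an illustration; the helper name `is_gzip` is hypothetical and not part of NiFi.

```python
# gzip streams begin with these two bytes (RFC 1952).
GZIP_MAGIC = b"\x1f\x8b"


def is_gzip(path):
    """Return True if the file at `path` starts with the gzip magic bytes."""
    with open(path, "rb") as f:
        return f.read(2) == GZIP_MAGIC
```

If `is_gzip` returns False for a file that the flow wrote under the name sample.gz, the content was decompressed and only the filename is stale.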

So, if your question is about the filename, then:

  1. You can set Update Filename = true in the CompressContent processor. In this case sample.gz is renamed to sample during decompression.
  2. Alternatively, use an UpdateAttribute processor to change the filename attribute.
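To make the combined effect concrete, here is a minimal standalone sketch of what the flow does to a file when Update Filename = true, as described above: the content is decompressed and the trailing .gz is dropped from the name. This is not NiFi's implementation; the function name `decompress_gz` and its parameters are assumptions made for illustration.

```python
import gzip
import shutil
from pathlib import Path


def decompress_gz(src, dest_dir):
    """Decompress `src` into `dest_dir`, dropping a trailing '.gz' from the name.

    Returns the path of the decompressed file.
    """
    src = Path(src)
    # Mirror the renaming described in the answer: sample.gz -> sample.
    name = src.name[:-3] if src.name.endswith(".gz") else src.name
    dest = Path(dest_dir) / name
    # Stream the decompressed bytes so large files are not loaded into memory.
    with gzip.open(src, "rb") as fin, open(dest, "wb") as fout:
        shutil.copyfileobj(fin, fout)
    return dest
```

Note that the output name comes from the archive's own filename (sample), not from any name stored inside the gzip header (sample_file in the question), which matches the behavior the answer reports.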


Source: https://stackoverflow.com/questions/44652827/apache-nifi-decompression
