Question
I'm new to Apache NiFi and am trying to build a flow as a POC. I'd appreciate some guidance.
I have a gzip-compressed file, say 'sample.gz', containing a file, say 'sample_file'. I need to decompress sample.gz and store 'sample_file' in an HDFS location.
I'm using a GetFile processor to pick up sample.gz, a CompressContent processor in decompress mode to decompress it, and a PutHDFS processor to put the decompressed file in the HDFS location.
After running the flow, I find that only the original sample.gz file has been copied to the HDFS location, whereas I need the sample_file inside the gz file to be copied. So decompression does not appear to have worked for me.
I hope this explains the issue I'm facing. Please suggest whether I need to change my approach.
Answer 1:
I used the same sequence of processors but changed PutHDFS to PutFile.
GetFile --> CompressContent(decompress) --> PutFile
In NiFi v1.3.0 this works fine.
One note: if I keep the parameter Update Filename = false for CompressContent, then the filename attribute stays the same after decompression as before (sample.gz), but the content is decompressed.
So, if your question is about the filename, then:
- you can change it by setting the parameter Update Filename = true in the CompressContent processor. In this case sample.gz will be changed to sample during decompression.
- or use an UpdateAttribute processor to change the filename attribute.
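To illustrate why the output can still be named sample.gz even though its content has been decompressed, here is a minimal Python sketch. It is not NiFi code: the FlowFile class and the decompress function are hypothetical stand-ins that only mimic the behaviour described above, where the content and the filename attribute are handled separately.

```python
import gzip
from dataclasses import dataclass, field


@dataclass
class FlowFile:
    """Hypothetical stand-in for a flow file: content bytes plus attributes."""
    content: bytes
    attributes: dict = field(default_factory=dict)


def decompress(flow_file, update_filename=False):
    """Gunzip the content; rename only when update_filename is True.

    Assumption: this mimics the Update Filename behaviour described in the
    answer; it is not how NiFi itself is implemented.
    """
    content = gzip.decompress(flow_file.content)
    attributes = dict(flow_file.attributes)
    name = attributes.get("filename", "")
    if update_filename and name.endswith(".gz"):
        # Drop the ".gz" suffix, e.g. sample.gz -> sample
        attributes["filename"] = name[: -len(".gz")]
    return FlowFile(content, attributes)


payload = b"hello from sample_file\n"
original = FlowFile(gzip.compress(payload), {"filename": "sample.gz"})

kept = decompress(original)                           # Update Filename = false
renamed = decompress(original, update_filename=True)  # Update Filename = true

print(kept.attributes["filename"], kept.content == payload)
print(renamed.attributes["filename"], renamed.content == payload)
```

In both cases the content is the decompressed payload; only the filename attribute differs. A file that lands in the target directory still named sample.gz may therefore already hold decompressed data, which is worth checking before concluding that decompression failed.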
Source: https://stackoverflow.com/questions/44652827/apache-nifi-decompression