Compressed file ingestion using Flume


Question


Can I ingest any type of compressed file (say zip, bzip, lz4, etc.) into HDFS using Flume NG 1.3.0? I am planning to use spoolDir. Any suggestions, please?


Answer 1:


You can ingest any type of file. You need to select an appropriate deserializer.

The configuration below works for compressed files. Adjust the options as you need:

agent.sources = src-1
agent.channels = c1
agent.sinks = k1

agent.sources.src-1.type = spooldir
agent.sources.src-1.channels = c1
agent.sources.src-1.spoolDir = /tmp/myspooldir
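# BlobDeserializer reads each spooled file as a single binary event, so compressed files are not split line by line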
agent.sources.src-1.deserializer=org.apache.flume.sink.solr.morphline.BlobDeserializer$Builder

agent.channels.c1.type = file

agent.sinks.k1.type = hdfs
agent.sinks.k1.channel = c1
agent.sinks.k1.hdfs.path = /user/myevents/
agent.sinks.k1.hdfs.filePrefix = events-
agent.sinks.k1.hdfs.fileType = CompressedStream
agent.sinks.k1.hdfs.round = true
agent.sinks.k1.hdfs.roundValue = 10
agent.sinks.k1.hdfs.roundUnit = minute
agent.sinks.k1.hdfs.codeC = snappyCodec



Answer 2:


You may leave the files uncompressed at the source and use the compression codecs provided by Flume to compress the data as it is written to HDFS. Avro sources and sinks also support compression, in case you are planning to use them.
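
As a rough sketch of that idea (agent, source, and sink names here are made up; the property names follow the Flume user guide, and the codec values are only examples that must be available in your Hadoop/Flume build):

# Compress only when writing to HDFS; the spooled input stays uncompressed
agent.sinks.k1.type = hdfs
agent.sinks.k1.hdfs.path = /user/myevents/
agent.sinks.k1.hdfs.fileType = CompressedStream
agent.sinks.k1.hdfs.codeC = gzip

# Optional Avro hop: compression-type must match on the sending sink
# and on the receiving source
agent.sinks.fwd.type = avro
agent.sinks.fwd.hostname = collector.example.com
agent.sinks.fwd.port = 4545
agent.sinks.fwd.compression-type = deflate

agent2.sources.in.type = avro
agent2.sources.in.bind = 0.0.0.0
agent2.sources.in.port = 4545
agent2.sources.in.compression-type = deflate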




Answer 3:


I wrote a custom source component and resolved the issue. The custom source can be used to ingest any kind of file.
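
The answer does not include the code, but a minimal sketch of what such a custom source might look like against the Flume 1.x SDK is shown below. The package, class name, and spoolDir property are hypothetical illustrations, not the answerer's actual component.

package com.example.flume;

import java.io.File;
import java.nio.file.Files;
import java.util.HashMap;
import java.util.Map;

import org.apache.flume.Context;
import org.apache.flume.Event;
import org.apache.flume.EventDeliveryException;
import org.apache.flume.PollableSource;
import org.apache.flume.conf.Configurable;
import org.apache.flume.event.EventBuilder;
import org.apache.flume.source.AbstractSource;

// Hypothetical custom source: emits each file in a directory -- compressed or not --
// as one binary event, then deletes the file.
public class WholeFileSource extends AbstractSource
        implements Configurable, PollableSource {

    private File spoolDir;

    @Override
    public void configure(Context context) {
        // Directory to watch, e.g. agent.sources.src-1.spoolDir = /tmp/myspooldir
        spoolDir = new File(context.getString("spoolDir", "/tmp/myspooldir"));
    }

    @Override
    public Status process() throws EventDeliveryException {
        File[] files = spoolDir.listFiles();
        if (files == null || files.length == 0) {
            return Status.BACKOFF;   // nothing to ingest, let Flume back off
        }
        try {
            for (File f : files) {
                if (!f.isFile()) {
                    continue;
                }
                byte[] body = Files.readAllBytes(f.toPath());   // whole file as one event
                Map<String, String> headers = new HashMap<>();
                headers.put("file", f.getName());
                Event event = EventBuilder.withBody(body, headers);
                getChannelProcessor().processEvent(event);      // hand off to the channel
                if (!f.delete()) {
                    throw new EventDeliveryException("Could not delete " + f);
                }
            }
            return Status.READY;
        } catch (Exception e) {
            throw new EventDeliveryException("Failed to spool files", e);
        }
    }
}

The compiled class would be placed on the Flume classpath and referenced by its fully qualified name via agent.sources.src-1.type.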



Source: https://stackoverflow.com/questions/18376831/compressed-file-ingestion-using-flume
