Question
Can I ingest any type of compressed file (say zip, bzip2, lz4, etc.) into HDFS using Flume NG 1.3.0? I am planning to use spoolDir. Any suggestion, please?
Answer 1:
You can ingest any type of file; you just need to select an appropriate deserializer.
The configuration below works for compressed files. Adjust the options as you need:
agent.sources = src-1
agent.channels = c1
agent.sinks = k1
agent.sources.src-1.type = spooldir
agent.sources.src-1.channels = c1
agent.sources.src-1.spoolDir = /tmp/myspooldir
agent.sources.src-1.deserializer = org.apache.flume.sink.solr.morphline.BlobDeserializer$Builder
agent.channels.c1.type = file
agent.sinks.k1.type = hdfs
agent.sinks.k1.channel = c1
agent.sinks.k1.hdfs.path = /user/myevents/
agent.sinks.k1.hdfs.filePrefix = events-
agent.sinks.k1.hdfs.fileType = CompressedStream
agent.sinks.k1.hdfs.round = true
agent.sinks.k1.hdfs.roundValue = 10
agent.sinks.k1.hdfs.roundUnit = minute
agent.sinks.k1.hdfs.codeC = snappyCodec
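One caveat worth noting with this approach: the BlobDeserializer buffers each spooled file in memory as a single event, so large archives can exhaust heap. A sketch of the relevant knob, assuming the default limit documented for the deserializer:

```properties
# BlobDeserializer reads the whole file into one event; cap the event
# size (bytes) so an oversized archive fails fast instead of OOM-ing.
# 100000000 is the documented default; tune to your expected file sizes.
agent.sources.src-1.deserializer.maxBlobLength = 100000000
```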
Answer 2:
You may leave the files uncompressed at the source and use the compression codecs provided by Flume to compress the data as it is ingested into HDFS. The Avro source and sink also support compression, in case you are planning to use them.
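A minimal sketch of this sink-side compression, reusing the sink name k1 from Answer 1 (gzip shown here; the HDFS sink also accepts codecs such as bzip2, lzo, and snappy):

```properties
# Uncompressed files in the spool dir; the HDFS sink compresses on write.
agent.sinks.k1.type = hdfs
agent.sinks.k1.hdfs.fileType = CompressedStream
agent.sinks.k1.hdfs.codeC = gzip
```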
Answer 3:
I wrote a custom source component and resolved this that way. A custom source can be used to ingest any kind of file.
Source: https://stackoverflow.com/questions/18376831/compressed-file-ingestion-using-flume