Flume HDFS Sink generates lots of tiny files on HDFS

北恋 2021-01-24 18:14

I have a toy setup sending log4j messages to HDFS using Flume. I'm not able to configure the HDFS sink to avoid many small files. I thought I could configure the hdfs sink to

3 answers
  •  天涯浪人
    2021-01-24 18:20

    HDFS Sink has a property hdfs.batchSize (default 100) which describes "number of events written to file before it is flushed to HDFS". I think that's your problem here.

    Consider also checking all the other properties of the HDFS Sink.
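
    As a rough illustration of which properties might be worth checking, here is a minimal sketch of an HDFS sink configuration. The agent name a1, the sink name k1 and the HDFS path are placeholders, not taken from the question, and the values are only examples. Besides hdfs.batchSize, the sink has three roll settings (hdfs.rollInterval, hdfs.rollSize, hdfs.rollCount) that decide when the current file is closed and a new one is started; their defaults are 30 seconds, 1024 bytes and 10 events, which can produce many small files on their own.

    ```properties
    # Hypothetical agent "a1" with sink "k1"; names and path are placeholders.
    a1.sinks.k1.type = hdfs
    a1.sinks.k1.hdfs.path = hdfs://namenode/flume/logs

    # Events written to the file before it is flushed to HDFS (default 100).
    a1.sinks.k1.hdfs.batchSize = 100

    # Roll triggers; setting one to 0 disables that trigger.
    # Roll every 10 minutes (default 30 seconds).
    a1.sinks.k1.hdfs.rollInterval = 600
    # Roll at roughly 128 MB (default 1024 bytes).
    a1.sinks.k1.hdfs.rollSize = 134217728
    # Never roll on event count (default 10 events).
    a1.sinks.k1.hdfs.rollCount = 0
    ```

    Whichever enabled trigger fires first closes the file, so a trigger that is left at its default can still cause small files even if the others are raised.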
