Flume HDFS sink keeps rolling small files

Submitted by 喜欢而已 on 2019-11-29 16:43:25

It turned out to be a problem with the HDFS replication factor. I am working on a virtual machine with a single virtual datanode, so I had to set the replication factor to 1 for the sink to roll files as expected.

Set dfs.replication on your cluster to an appropriate value. This can be done by editing the hdfs-site.xml file (on every machine in the cluster). However, this alone is not enough.

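You also need to create an hdfs-site.xml file on the Flume classpath and put the same dfs.replication value from your cluster in it. The Hadoop client libraries read this file when they operate on the cluster; if it is missing, they fall back to default values. Hadoop's built-in default for dfs.replication is 3, so on a cluster with fewer datanodes the sink is likely to see its blocks as under-replicated and keep rolling new files.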

<?xml version="1.0" encoding="UTF-8"?>
<configuration>
    <property>
        <name>dfs.replication</name>
        <value>2</value>
    </property>
</configuration>
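
To confirm that the cluster's hdfs-site.xml and the copy on the Flume classpath agree, something like the following could be used. It is only a minimal sketch: the function names are made up for illustration, the file paths are whatever your installation uses, and it only reads the one file it is given rather than reproducing Hadoop's full configuration lookup. The fallback value of 3 is Hadoop's built-in default for dfs.replication.

```python
import sys
import xml.etree.ElementTree as ET

# Hadoop's built-in default when dfs.replication is not set in hdfs-site.xml.
HADOOP_DEFAULT_REPLICATION = 3


def read_replication(path):
    """Return dfs.replication from the given hdfs-site.xml, or the default.

    Hypothetical helper: it looks only at this one file.
    """
    root = ET.parse(path).getroot()
    for prop in root.findall("property"):
        if prop.findtext("name", "").strip() == "dfs.replication":
            return int(prop.findtext("value", "").strip())
    return HADOOP_DEFAULT_REPLICATION


def replication_matches(cluster_path, flume_path):
    """True if both files resolve to the same replication factor."""
    return read_replication(cluster_path) == read_replication(flume_path)


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage: python check_replication.py <cluster hdfs-site.xml> <flume hdfs-site.xml>
    cluster, flume = sys.argv[1], sys.argv[2]
    print("cluster:", read_replication(cluster))
    print("flume:  ", read_replication(flume))
    print("match:  ", replication_matches(cluster, flume))
```

If the two values differ, or the Flume copy is missing the property, fix the file on the Flume classpath and restart the agent.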