Change File Split size in Hadoop

Asked by 陌清茗 on 2020-12-01 00:56 · 4 answers · 1956 views

I have a bunch of small files in an HDFS directory. Although the total volume of the files is relatively small, the processing time per file is huge, so I would like to use smaller input splits so that each file is handled by more mappers and the job parallelizes better. How can I change the split size in Hadoop?

4 Answers
  •  谎友^ (OP)
     2020-12-01 01:44

    The parameter mapred.max.split.size, which can be set per job individually, is what you are looking for. (In newer Hadoop versions the property is named mapreduce.input.fileinputformat.split.maxsize.) Don't change dfs.block.size, because that setting is global to HDFS and changing it can lead to problems.
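
    As a rough sketch of how this could look on the command line: if the job's driver uses Hadoop's generic options (e.g. implements `Tool`), the split size can be passed per job with `-D`. The jar name, class name, paths, and the 32 MB value below are placeholders for illustration, not values from the question.

    ```shell
    # Hypothetical invocation: cap each input split at 32 MB (33554432 bytes)
    # so a single file is divided among several mappers. Assumes my-job.jar
    # and com.example.MyJob exist and that the driver honors generic options.
    hadoop jar my-job.jar com.example.MyJob \
      -D mapreduce.input.fileinputformat.split.maxsize=33554432 \
      /input /output
    ```

    Setting the property this way affects only this job run, which is why it is preferable to touching the cluster-wide dfs.block.size.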
