Number of reducers in Hadoop

没有蜡笔的小新 2021-02-20 10:35

I am learning Hadoop and find the number of reducers very confusing:

1) The number of reducers is the same as the number of partitions.

2) Number of reducers is 0.95 or 1.75 m

4 answers
  •  醉酒成梦
    2021-02-20 11:06

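    If you do not set it explicitly, the number of reducers is calculated internally from the size of the data being processed (this size-based estimate is what Hive does when it plans a job). In a MapReduce driver program you can set it explicitly with the API below: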

    job.setNumReduceTasks(x)

    By default, one reducer is used for every 1 GB of input data (this was the default before Hive 0.14.0; later releases default to 256 MB).

    So if you are working with less than 1 GB of data and have not set the number of reducers explicitly, one reducer is used.

    Similarly, if your data is 10 GB, 10 reducers are used.

    You can also change this configuration to specify a larger or smaller size than 1 GB.

    The Hive property that sets the amount of data per reducer is:

    hive.exec.reducers.bytes.per.reducer

    You can view this property by running the set command in the Hive CLI.
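
As a rough illustration of the arithmetic above, the size-based estimate can be sketched in plain Java. This is not Hive's actual code: the class `ReducerEstimate`, the method `estimate`, and the `maxReducers` parameter are hypothetical names introduced here, and 1 GB is taken as 1,000,000,000 bytes. The cap is an assumption added for the sketch (Hive has a comparable upper bound, hive.exec.reducers.max).

```java
// Sketch only: mirrors the size-based estimate described above, not Hive's source code.
final class ReducerEstimate {

    // Hypothetical helper: ceil(inputBytes / bytesPerReducer), kept within [1, maxReducers].
    static int estimate(long inputBytes, long bytesPerReducer, int maxReducers) {
        if (bytesPerReducer <= 0 || maxReducers <= 0) {
            throw new IllegalArgumentException("bytesPerReducer and maxReducers must be positive");
        }
        // Ceiling division: any partial chunk of data still needs a reducer.
        long needed = (inputBytes + bytesPerReducer - 1) / bytesPerReducer;
        // At least one reducer, even for very small inputs.
        needed = Math.max(1L, needed);
        // Never exceed the assumed upper bound.
        return (int) Math.min(needed, (long) maxReducers);
    }

    public static void main(String[] args) {
        long oneGb = 1_000_000_000L; // assumed per-reducer size, as in the answer
        System.out.println(estimate(500_000_000L, oneGb, 999)); // input under 1 GB
        System.out.println(estimate(10 * oneGb, oneGb, 999));   // 10 GB of input
    }
}
```

With a 1 GB per-reducer size, 10 GB of input yields 10 reducers and anything under 1 GB yields one, which matches the examples in the answer. Raising the per-reducer size lowers the count, and lowering it raises the count.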

    The partitioner only decides which data goes to which reducer.
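
To make the partitioner's role concrete, here is a minimal sketch of hash-based partitioning in plain Java. It follows the same formula as Hadoop's default HashPartitioner: the key's hash code, masked to be non-negative, modulo the number of reduce tasks. The class and method names are hypothetical, and the code deliberately does not depend on Hadoop.

```java
// Sketch only: a standalone imitation of hash partitioning, not a Hadoop Partitioner subclass.
final class PartitionSketch {

    // Returns the index of the reducer (0 .. numReduceTasks - 1) that receives this key.
    static int partitionFor(Object key, int numReduceTasks) {
        if (numReduceTasks <= 0) {
            throw new IllegalArgumentException("numReduceTasks must be positive");
        }
        // Masking with Integer.MAX_VALUE clears the sign bit so the result is never negative.
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    public static void main(String[] args) {
        int numReduceTasks = 3; // chosen elsewhere, e.g. via job.setNumReduceTasks(3)
        String[] keys = {"apple", "banana", "cherry", "apple"};
        for (String key : keys) {
            // Repeated keys print the same partition index.
            System.out.println(key + " -> reducer " + partitionFor(key, numReduceTasks));
        }
    }
}
```

The number of reducers fixes how many partitions exist. The partitioner only picks one of them for each key, so every record with the same key ends up at the same reducer.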
