How do I increase the number of mappers and reducers in Hadoop according to the number of instances used, in order to improve performance?

说谎 2020-12-18 17:11

If I increase the number of mappers and decrease the number of reducers, does the performance of a job change (improve or degrade) during execution?

4 answers
  •  误落风尘
    2020-12-18 17:53

    Changing the number of mappers is a pure optimization and should not affect the results. Set the number so that it fully utilizes your cluster (if the cluster is dedicated). Start with a number of mappers per node equal to the number of cores. Watch the CPU utilization, and increase the number until you reach almost full CPU utilization or your system starts swapping. You may need fewer mappers than cores if you do not have enough memory.
    The number of reducers does affect the results, so if you need a specific number of reducers (such as 1), set it explicitly.
    If you can handle the results of any number of reducers, apply the same optimization as for the mappers.
    In theory you can become I/O bound during this tuning process, so pay attention to that as well when tuning the number of tasks. You can recognize it by low CPU utilization despite an increase in the mapper/reducer count.
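
    The starting point described above (one task per core, fewer if memory is short) can be sketched as a small calculation. This is only an illustration, not a Hadoop API: the function names, the per-task memory figure, and the 2048 MB reserved for the OS and daemons are all assumptions, and the result is just a first guess to refine by watching CPU, swap, and I/O.

```python
def tasks_per_node(cores: int, node_memory_mb: int, task_memory_mb: int,
                   reserved_mb: int = 2048) -> int:
    """First guess for concurrent map (or reduce) tasks on one node.

    Hypothetical helper: starts from one task per core and lowers the
    count when memory, not CPU, is the tighter limit.
    """
    if cores < 1 or task_memory_mb < 1:
        raise ValueError("cores and task_memory_mb must be positive")
    # Memory left for tasks after the (assumed) OS/daemon reservation.
    usable_mb = max(node_memory_mb - reserved_mb, 0)
    memory_limit = usable_mb // task_memory_mb
    # Never go below one task, never above the core count.
    return max(1, min(cores, memory_limit))


def cluster_tasks(nodes: int, cores: int, node_memory_mb: int,
                  task_memory_mb: int, reserved_mb: int = 2048) -> int:
    """Total concurrent tasks for a cluster of identical nodes."""
    return nodes * tasks_per_node(cores, node_memory_mb,
                                  task_memory_mb, reserved_mb)


if __name__ == "__main__":
    # CPU-limited node: 8 cores, 32 GB RAM, 2 GB per task.
    print(tasks_per_node(8, 32768, 2048))
    # Memory-limited node: 16 cores, 16 GB RAM, 2 GB per task.
    print(tasks_per_node(16, 16384, 2048))
```

    If raising the count beyond this estimate does not raise CPU utilization, the job is probably I/O bound and adding more tasks will not help.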
