How to tell MapReduce how many mappers to use?
Question

I am trying to optimize the speed of a MapReduce job. Is there any way to tell Hadoop to use a particular number of mapper/reducer processes, or at least a minimum number of mapper processes?

The documentation says you can do this with the method public void setNumMapTasks(int n) of the JobConf class. That way is now obsolete, so I am starting the job with the Job class. What is the right way of doing this?

Answer 1:

The number of map tasks is determined by the number of blocks in the