When do reduce tasks start in Hadoop?

深忆病人 2020-11-27 10:04

In Hadoop when do reduce tasks start? Do they start after a certain percentage (threshold) of mappers complete? If so, is this threshold fixed? What kind of threshold is typ

8 answers
  •  Happy的楠姐
    2020-11-27 10:11

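    The reduce phase itself (the calls to reduce()) starts only after all the mappers have completed. Reduce tasks can be launched earlier so that they can begin copying map output; the fraction of maps that must finish first is set by mapreduce.job.reduce.slowstart.completedmaps (default 0.05), so the threshold is configurable rather than fixed.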
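
As a rough illustration of how the launch threshold mapreduce.job.reduce.slowstart.completedmaps behaves, here is a toy model in Python. It is not Hadoop code: the function name and the ceiling-based rounding are assumptions made for this sketch.

```python
import math


def reducers_may_launch(completed_maps: int, total_maps: int,
                        slowstart: float = 0.05) -> bool:
    """Toy model (hypothetical helper, not a Hadoop API).

    Returns True once enough map tasks have finished for reduce tasks
    to be scheduled. `slowstart` plays the role of
    mapreduce.job.reduce.slowstart.completedmaps.
    """
    # Assumption: the fraction is rounded up to a whole number of maps.
    needed = math.ceil(slowstart * total_maps)
    return completed_maps >= needed


if __name__ == "__main__":
    # With a threshold of 0.5 and 10 maps, reducers wait for 5 maps.
    print(reducers_may_launch(4, 10, slowstart=0.5))
    print(reducers_may_launch(5, 10, slowstart=0.5))
```

Setting the value to 1.0 in this model makes reduce tasks wait for every map, which matches the "only after all mappers" behaviour; a small value lets the copy step overlap with the remaining maps.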

    Data transfer, however, begins as each map task finishes, and it is a pull operation.

    That is, each reducer periodically checks for map tasks that have completed. Whenever it finds a mapper that has finished, the reducer pulls its share of that mapper's intermediate data.

    The intermediate data from a mapper is stored on that mapper's local disk, and the transfer from mapper to reducer happens over the network (data locality is not preserved in the reduce phase).
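
The pull-then-reduce ordering described in this answer can be sketched with a small in-memory simulation. This is a toy model in Python, not Hadoop's implementation: `partition`, `simulate_shuffle`, the event log, and the summing reducer are all hypothetical names and choices made for illustration, and no real disk or network is involved.

```python
from collections import defaultdict


def partition(key: str, num_reducers: int) -> int:
    # Deterministic stand-in for a hash partitioner (assumption).
    return sum(key.encode("utf-8")) % num_reducers


def simulate_shuffle(map_outputs, completion_order, num_reducers):
    """map_outputs: map_id -> list of (key, value) pairs.
    completion_order: map_ids in the order the map tasks finish;
    it must list every map exactly once.
    """
    if sorted(completion_order) != sorted(map_outputs):
        raise ValueError("completion_order must cover every map once")

    fetched = [defaultdict(list) for _ in range(num_reducers)]
    copied_from = [set() for _ in range(num_reducers)]
    finished = set()
    log = []

    for map_id in completion_order:
        finished.add(map_id)
        # Each reducer pulls from any finished map it has not copied yet.
        for r in range(num_reducers):
            for m in sorted(finished - copied_from[r]):
                for key, value in map_outputs[m]:
                    if partition(key, num_reducers) == r:
                        fetched[r][key].append(value)
                copied_from[r].add(m)
                log.append(("copy", r, m))

    # The reduce step runs only after every map output has been copied.
    results = {}
    for r in range(num_reducers):
        for key in sorted(fetched[r]):
            results[key] = sum(fetched[r][key])
        log.append(("reduce", r))
    return results, log


if __name__ == "__main__":
    outputs = {"m0": [("a", 1), ("b", 1)], "m1": [("a", 1)]}
    results, log = simulate_shuffle(outputs, ["m1", "m0"], 2)
    print(results)
    for event in log:
        print(event)
```

In the log, every "copy" event precedes every "reduce" event, even though copying starts as soon as the first map finishes.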
