Question
Somewhat of an odd question, but does anyone know what kind of sort MapReduce uses in the sort portion of shuffle/sort? I would think merge or insertion (in keeping with the whole MapReduce paradigm), but I'm not sure.
Answer 1:
It's Quicksort; afterwards, the sorted intermediate outputs are merged together. The Quicksort implementation tracks its recursion depth and gives up when the recursion gets too deep, in which case it falls back to Heapsort.
Have a look at the Quicksort class:
org.apache.hadoop.util.QuickSort
You can change the algorithm used via the map.sort.class property in hadoop-default.xml (in later Hadoop releases the property is named mapreduce.map.sort.class).
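To make the fallback behaviour concrete, here is a minimal sketch of a Quicksort that switches to Heapsort once its recursion depth budget is used up. This is not Hadoop's code: the class name DepthLimitedSort, the depth budget of roughly 2 * log2(n), and the middle-element pivot are all assumptions chosen for illustration, and the real org.apache.hadoop.util.QuickSort may differ in every one of those details.

```java
import java.util.Arrays;

/** Illustrative sketch only; NOT the Hadoop implementation. */
final class DepthLimitedSort {

    /** Sorts in place with an assumed depth budget of about 2 * log2(n). */
    static void sort(int[] a) {
        // 32 - numberOfLeadingZeros(n) equals floor(log2(n)) + 1 for n >= 1.
        int bits = 32 - Integer.numberOfLeadingZeros(Math.max(a.length, 1));
        sort(a, 2 * bits);
    }

    /** Sorts in place; a maxDepth of 0 forces the Heapsort path immediately. */
    static void sort(int[] a, int maxDepth) {
        quickSort(a, 0, a.length - 1, maxDepth);
    }

    private static void quickSort(int[] a, int lo, int hi, int depth) {
        if (lo >= hi) {
            return; // zero or one element: already sorted
        }
        if (depth <= 0) {
            heapSort(a, lo, hi); // recursion too deep: give up on Quicksort
            return;
        }
        int p = partition(a, lo, hi);
        quickSort(a, lo, p - 1, depth - 1);
        quickSort(a, p + 1, hi, depth - 1);
    }

    /** Lomuto partition using the middle element as the pivot. */
    private static int partition(int[] a, int lo, int hi) {
        swap(a, lo + (hi - lo) / 2, hi); // park the pivot at the end
        int pivot = a[hi];
        int store = lo;
        for (int i = lo; i < hi; i++) {
            if (a[i] < pivot) {
                swap(a, i, store++);
            }
        }
        swap(a, store, hi); // move the pivot to its final position
        return store;
    }

    /** Heapsort restricted to the subrange a[lo..hi]. */
    private static void heapSort(int[] a, int lo, int hi) {
        int n = hi - lo + 1;
        for (int i = n / 2 - 1; i >= 0; i--) {
            siftDown(a, lo, i, n); // build a max-heap
        }
        for (int end = n - 1; end > 0; end--) {
            swap(a, lo, lo + end); // move the current maximum to the back
            siftDown(a, lo, 0, end);
        }
    }

    /** Restores the max-heap property below root for a heap stored at a[base..base+size-1]. */
    private static void siftDown(int[] a, int base, int root, int size) {
        while (true) {
            int child = 2 * root + 1;
            if (child >= size) {
                return;
            }
            if (child + 1 < size && a[base + child + 1] > a[base + child]) {
                child++; // pick the larger child
            }
            if (a[base + root] >= a[base + child]) {
                return;
            }
            swap(a, base + root, base + child);
            root = child;
        }
    }

    private static void swap(int[] a, int i, int j) {
        int t = a[i];
        a[i] = a[j];
        a[j] = t;
    }

    public static void main(String[] args) {
        int[] data = {5, 3, 9, 1, 5, 7, 2};
        sort(data);
        System.out.println(Arrays.toString(data)); // prints [1, 2, 3, 5, 5, 7, 9]
    }
}
```

The point of the depth limit is that Quicksort's worst case is quadratic, while Heapsort is O(n log n) regardless of input, so handing a badly partitioned subrange to Heapsort bounds the total cost.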
Answer 2:
For more depth, see the post "Map-Reduce: Shuffle and sort" on my blog, "Hadoop: Some Salient Understandings".
Source: https://stackoverflow.com/questions/5779750/mapreduce-shuffle-sort-method