How does MapReduce sort and shuffle work?
Question

I am using Yelp's MRJob library to get map-reduce functionality. I know that MapReduce has an internal sort and shuffle step that sorts the values on the basis of their keys. So if I have the following results after the map phase:

(1, 24) (4, 25) (3, 26)

I know the sort and shuffle phase will produce the following output:

(1, 24) (3, 26) (4, 25)

which is as expected. But if I have two similar keys and different values, why does the sort and shuffle phase sort the data on the basis of first
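
For reference, the key-based ordering described in the question can be sketched in plain Python. This is only an illustration of "sort by key, then group", not MRJob's or Hadoop's actual implementation; the function name `sort_and_shuffle` and the duplicate-key sample data are made up for the example.

```python
from itertools import groupby
from operator import itemgetter


def sort_and_shuffle(pairs):
    """Sort (key, value) pairs by key only, then group the values per key.

    Hypothetical helper for illustration; not part of MRJob.
    """
    # sorted() is stable, so values that share a key keep the order
    # in which the map phase emitted them.
    ordered = sorted(pairs, key=itemgetter(0))
    return [
        (key, [value for _, value in group])
        for key, group in groupby(ordered, key=itemgetter(0))
    ]


# Map output from the question.
print(sort_and_shuffle([(1, 24), (4, 25), (3, 26)]))

# Made-up map output with a repeated key.
print(sort_and_shuffle([(1, 30), (4, 25), (1, 24)]))
```

In this sketch the comparison looks at the key alone, so the order of values within one key is simply whatever order they arrived in. A real framework may order them differently.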