Computing median in map reduce

礼貌的吻别 2020-12-14 20:33

Can someone explain the computation of median/quantiles in map reduce?

My understanding of Datafu's median is that the 'n' mappers sort the data and send the da

4 answers
  •  遥遥无期
    2020-12-14 21:09

    O((n log n)/p) to sort it then O(1) to get the median.
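To make the sort-based route concrete, here is a minimal single-process sketch in Python. It is not Hadoop or Datafu code: the function name `median_by_sort` and the list-of-lists layout standing in for partitions are assumptions for illustration only. Each partition is sorted on its own (the part that parallelizes), the sorted runs are merged, and the middle item is read off by index.

```python
import heapq


def median_by_sort(partitions):
    """Return the lower median of all values spread across `partitions`.

    `partitions` is a hypothetical stand-in for the data held by p workers.
    """
    # "Map" side: every worker sorts only its own slice of the data.
    sorted_parts = [sorted(part) for part in partitions]

    # "Reduce" side: merge the already-sorted runs into one ordered list.
    merged = list(heapq.merge(*sorted_parts))
    if not merged:
        raise ValueError("median of empty input")

    # Once the data is globally ordered, the median is a single index lookup.
    return merged[(len(merged) - 1) // 2]


if __name__ == "__main__":
    print(median_by_sort([[5, 1, 9], [3, 7], [8, 2]]))
```

For an even number of items this returns the lower of the two middle values; averaging the two is a one-line change if that definition is preferred.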

    Yes... you can get O(n/p), but you can't use the out-of-the-box sort functionality in Hadoop. I would just sort and take the middle item unless you can justify the 2-20 hours of development time to code a parallel k-th largest algorithm.
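For comparison, a parallel k-th element selection can be sketched as repeated rounds of "pick a pivot, let every partition count its values around it, keep only the side that contains rank k". The Python below is a single-process simulation under stated assumptions, not the answerer's algorithm and not a Hadoop job: `parallel_select`, `count_around`, and the list-of-lists partitions are hypothetical names, and a real job would run each round as its own pass over the data.

```python
import random


def count_around(part, pivot):
    """Per-partition work for one round: count values below and equal to the pivot."""
    less = sum(1 for x in part if x < pivot)
    equal = sum(1 for x in part if x == pivot)
    return less, equal


def parallel_select(partitions, k, seed=0):
    """Return the k-th smallest value (0-based) across all partitions."""
    rng = random.Random(seed)
    parts = [list(p) for p in partitions]
    total = sum(len(p) for p in parts)
    if not 0 <= k < total:
        raise IndexError("k is out of range")

    while True:
        # Driver: pick a random pivot, choosing a partition in proportion to its size.
        source = rng.choices(parts, weights=[len(p) for p in parts])[0]
        pivot = rng.choice(source)

        # Workers: one linear scan each, no sorting involved.
        counts = [count_around(p, pivot) for p in parts]
        less = sum(c[0] for c in counts)
        equal = sum(c[1] for c in counts)

        if k < less:
            # Rank k lies strictly below the pivot.
            parts = [[x for x in p if x < pivot] for p in parts]
        elif k < less + equal:
            return pivot
        else:
            # Rank k lies strictly above the pivot; shift k past what was dropped.
            k -= less + equal
            parts = [[x for x in p if x > pivot] for p in parts]


if __name__ == "__main__":
    data = [[5, 1, 9], [3, 7], [8, 2]]
    n = sum(len(p) for p in data)
    print(parallel_select(data, (n - 1) // 2))
```

Each round costs one scan per partition, and with random pivots the surviving data is expected to shrink quickly, which is where the linear-time behaviour comes from. The price is several rounds of coordination instead of one sort, which is the development effort the answer warns about.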
