Hadoop one Map and multiple Reduce

陌清茗 2020-12-23 16:48

We have a large dataset to analyze with multiple reduce functions.

All reduce algorithms work on the same dataset generated by the s

6 answers
  •  感动是毒
    2020-12-23 17:20

    Are you expecting every reducer to work on exactly the same mapped data? At the very least the "key" has to differ, since the key decides which reducer a record goes to.

    You can emit each record multiple times in the mapper, once per reducer, using a composite key built from $i and $key (where $i identifies the i-th reducer and $key is your original key). You then need to add a "Partitioner" so that these n records are distributed across the reducers based on $i, and use a "GroupingComparator" to group the records by the original $key.

    It's possible, but not in a trivial way within a single MR job.
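
The routing described in the answer above can be sketched in plain Python. This is only a simulation of the idea, not Hadoop code: `map_record`, `partition`, and `run_job` are hypothetical names chosen for illustration, and representing the composite key as a tuple `(i, key)` is an assumption. In a real job, the partitioning and grouping steps would be done by a custom "Partitioner" and "GroupingComparator".

```python
from collections import defaultdict


def map_record(key, value, num_reducers):
    """Emit one copy of the record per reduce function, tagged with index i."""
    for i in range(num_reducers):
        yield (i, key), value


def partition(composite_key, num_reducers):
    """Route on the tag i only, so copy i always lands on reducer i."""
    i, _ = composite_key
    return i % num_reducers


def run_job(records, reduce_functions):
    """Simulate map, shuffle, and reduce for several reduce functions."""
    n = len(reduce_functions)
    # Shuffle: one bucket per reducer, filled according to the partitioner.
    buckets = [defaultdict(list) for _ in range(n)]
    for key, value in records:
        for composite_key, v in map_record(key, value, n):
            _, original_key = composite_key
            # Grouping: inside a reducer, values are grouped by the original key.
            buckets[partition(composite_key, n)][original_key].append(v)
    # Reduce: reducer i applies the i-th function to every group it received.
    return [
        {k: reduce_functions[i](vs) for k, vs in sorted(buckets[i].items())}
        for i in range(n)
    ]


records = [("a", 1), ("b", 5), ("a", 3)]
results = run_job(records, [sum, max])
print(results)  # [{'a': 4, 'b': 5}, {'a': 3, 'b': 5}]
```

Every reducer sees the full mapped dataset, at the cost of shuffling n copies of each record, which is why the answer calls the approach possible but not trivial.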
