Hadoop handling data skew in reducer


[愿得一人] 2020-12-18 17:32

I am trying to determine whether the Hadoop API (Hadoop 2.0.0, MRv1) offers any hooks for handling data skew in a reducer. Scenario: I have a custom composite key a

2 answers

醉酒成梦 (original poster)

2020-12-18 17:46

If your process allows it, a Combiner (a reduce-type function) could help. If you pre-aggregate the data on the mapper side, then even if all of your data ends up in the same reducer, the amount of data may be manageable.
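To make the pre-aggregation idea concrete, here is a minimal sketch in plain Java that only simulates what a combiner does; it does not use the Hadoop API. The class and method names (`CombinerSketch`, `combine`, `reduce`) are hypothetical, and the sketch assumes the reduce function is a sum, which is associative and commutative and therefore safe to apply in partial steps.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical simulation of map-side pre-aggregation (not Hadoop code).
class CombinerSketch {

    // Combine step: collapse every value one map task emitted for the same
    // key into a single partial sum, so at most one record per key is shuffled.
    static Map<String, Long> combine(List<Map.Entry<String, Long>> mapOutput) {
        Map<String, Long> partial = new LinkedHashMap<>();
        for (Map.Entry<String, Long> record : mapOutput) {
            partial.merge(record.getKey(), record.getValue(), Long::sum);
        }
        return partial;
    }

    // Reduce step: merge the partial sums arriving from all map tasks.
    static Map<String, Long> reduce(List<Map<String, Long>> partials) {
        Map<String, Long> totals = new LinkedHashMap<>();
        for (Map<String, Long> partial : partials) {
            partial.forEach((key, value) -> totals.merge(key, value, Long::sum));
        }
        return totals;
    }

    public static void main(String[] args) {
        // One map task whose output is dominated by a single "hot" key.
        List<Map.Entry<String, Long>> mapOutput = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            mapOutput.add(new AbstractMap.SimpleEntry<String, Long>("hot", 1L));
        }
        mapOutput.add(new AbstractMap.SimpleEntry<String, Long>("cold", 1L));

        Map<String, Long> partial = combine(mapOutput);
        System.out.println(mapOutput.size() + " records before combining, "
                + partial.size() + " after");
    }
}
```

In a real MRv1 job the same logic would live in a reducer-style class registered with `JobConf.setCombinerClass`. Hadoop treats the combiner as an optimization and may run it zero or more times, so the job has to produce correct results without it.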

An alternative would be to reimplement the partitioner to avoid the skewed case.
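One possible shape for such a partitioner is sketched below in plain Java, without the Hadoop API. It assumes the hot keys are known in advance and that a per-record `salt` (for example a hash of a secondary field) is available; `SkewAwarePartitionSketch`, `hotKeys`, `salt`, and `fanOut` are all hypothetical names. Ordinary keys get the usual hash partition, while each hot key is spread over `fanOut` neighbouring partitions.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch of skew-aware partitioning logic (not Hadoop code).
class SkewAwarePartitionSketch {

    // Same arithmetic as Hadoop's default HashPartitioner:
    // clear the sign bit, then take the remainder.
    static int hashPartition(String key, int numPartitions) {
        return (key.hashCode() & Integer.MAX_VALUE) % numPartitions;
    }

    // Keys known to be hot are spread over `fanOut` consecutive partitions,
    // chosen by the per-record salt; all other keys keep their hash partition.
    static int skewAwarePartition(String key, long salt, Set<String> hotKeys,
                                  int numPartitions, int fanOut) {
        int base = hashPartition(key, numPartitions);
        if (!hotKeys.contains(key)) {
            return base;
        }
        int offset = (int) Math.floorMod(salt, (long) fanOut);
        return (base + offset) % numPartitions;
    }

    public static void main(String[] args) {
        Set<String> hotKeys = new HashSet<>(Arrays.asList("hot"));
        for (long salt = 0; salt < 4; salt++) {
            System.out.println("hot/" + salt + " -> partition "
                    + skewAwarePartition("hot", salt, hotKeys, 8, 4));
        }
    }
}
```

In an MRv1 job this logic would go into the `getPartition(key, value, numPartitions)` method of a class implementing `org.apache.hadoop.mapred.Partitioner`, registered with `JobConf.setPartitionerClass`. The trade-off is that a hot key's records now reach several reducers, so each reducer produces only a partial result and a follow-up merge step is needed. That only works when the reduce logic can be split into partial steps.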
