Hadoop handling data skew in reducer

[愿得一人] 2020-12-18 17:32

I am trying to determine whether the Hadoop API (Hadoop 2.0.0, MRv1) offers any hooks for handling data skew in a reducer. Scenario: I have a custom composite key a

2 answers
  •  醉酒成梦
    2020-12-18 17:46

    If your process allows it, a Combiner (a reduce-type function) could help. If you pre-aggregate the data on the mapper side, then even if all of your data ends up in the same reducer, the amount of data may be manageable.
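
    To make the pre-aggregation idea concrete, here is a minimal sketch in plain Java with no Hadoop dependency. It assumes the reduce operation is a sum, which is safe to apply in partial steps because addition is associative and commutative. The class `CombinerSketch` and its `combine` method are hypothetical names for illustration only. In a real job this logic would normally live in a reducer-style class registered with `setCombinerClass`.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a combiner: it sums values per key on the map side,
// so a heavily repeated key leaves the mapper as a single record.
class CombinerSketch {

    /** Collapses one mapper's (key, count) records into one partial sum per key. */
    static Map<String, Long> combine(List<Map.Entry<String, Long>> mapOutput) {
        Map<String, Long> partial = new LinkedHashMap<>();
        for (Map.Entry<String, Long> record : mapOutput) {
            // Add this record's value to the running sum for its key.
            partial.merge(record.getKey(), record.getValue(), Long::sum);
        }
        return partial;
    }

    public static void main(String[] args) {
        // Simulated output of a single mapper: one skewed key and one rare key.
        List<Map.Entry<String, Long>> mapOutput = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            mapOutput.add(new AbstractMap.SimpleImmutableEntry<>("hot", 1L));
        }
        mapOutput.add(new AbstractMap.SimpleImmutableEntry<>("rare", 1L));

        Map<String, Long> combined = combine(mapOutput);
        System.out.println(mapOutput.size() + " records in, "
                + combined.size() + " records out");
    }
}
```

    The skewed key still ends up on one reducer, but that reducer now receives one partial sum per mapper instead of every raw record. This does not help when the reduce logic cannot be applied in partial steps.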

    An alternative is to reimplement the partitioner so that it avoids the skewed case.
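
    One possible shape for such a partitioner is sketched below in plain Java, again with no Hadoop dependency. It uses key salting, which is an assumption on my part and not something the question specifies: keys known to be hot are spread over a few neighbouring partitions, while all other keys keep the usual hash partitioning. `SkewAwarePartitionSketch`, `HOT_KEYS`, `FAN_OUT`, and `recordSalt` are hypothetical names. In a real job the method body would go into a custom `Partitioner` registered with `setPartitionerClass`.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch of a skew-aware partitioning rule (key salting).
class SkewAwarePartitionSketch {

    // Assumption: the hot keys are known in advance, e.g. from sampling the input.
    static final Set<String> HOT_KEYS = new HashSet<>(Arrays.asList("hot"));

    // Assumption: each hot key is spread over this many partitions.
    static final int FAN_OUT = 4;

    /**
     * Mirrors the shape of Partitioner.getPartition(key, value, numPartitions).
     * recordSalt is any per-record number, such as a counter or a hash of the value.
     */
    static int getPartition(String key, int recordSalt, int numPartitions) {
        // Same formula as default hash partitioning: clear the sign bit, then modulo.
        int base = (key.hashCode() & Integer.MAX_VALUE) % numPartitions;
        if (!HOT_KEYS.contains(key)) {
            return base;
        }
        // Spread a hot key over up to FAN_OUT consecutive partitions.
        int spread = Math.min(FAN_OUT, numPartitions);
        int offset = (recordSalt & Integer.MAX_VALUE) % spread;
        return (base + offset) % numPartitions;
    }
}
```

    The trade-off is that records for a hot key now reach several reducers, so each of them produces only a partial result, and a follow-up step has to merge those partial results. Any grouping or sort comparator tied to the key would also need to stay consistent with this scheme.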
