What is the use of the grouping comparator in Hadoop MapReduce?

感情败类 2020-12-01 00:09

I would like to know why a grouping comparator is used in the secondary sort of MapReduce.

According to the Definitive Guide's example of secondary sorting

We want t

4 Answers

广开言路 (OP)

2020-12-01 01:02

Let me improve the statement "... take care of the map output keys going to particular reducer".

Reducer instance vs. reduce method: One JVM is created per reduce task, and each JVM holds a single instance of the Reducer class. This is the Reducer instance (I will call it the Reducer from now on). Within each Reducer, the reduce method is called multiple times, depending on the key grouping. Each time reduce is called, 'valuein' holds the list of map output values grouped by the key you define in the grouping comparator. By default, the grouping comparator uses the entire map output key.

In the example, the map output key is changed to 'year and temperature' to achieve sorting. Unless you define a grouping comparator that uses only the 'year' part of the map output key, you can't make all records of the same year go to the same reduce method call.
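To make the effect concrete, here is a minimal plain-Java sketch that simulates what happens inside a single reduce task. It does not use the Hadoop API. The names `YearTemp`, `reduceCalls`, `GROUP_BY_WHOLE_KEY` and `GROUP_BY_YEAR` and the sample records are made up for illustration, and partitioning is ignored (all keys are assumed to have already reached the same reduce task). The idea: keys arrive sorted, and a new reduce call starts whenever the grouping comparator says the current key differs from the previous one.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class GroupingComparatorSketch {

    /** Hypothetical composite map output key: (year, temperature). */
    static final class YearTemp {
        final int year;
        final int temperature;

        YearTemp(int year, int temperature) {
            this.year = year;
            this.temperature = temperature;
        }
    }

    /** Sort order of the composite key: year ascending, temperature descending. */
    static final Comparator<YearTemp> SORT = (a, b) -> {
        int byYear = Integer.compare(a.year, b.year);
        return byYear != 0 ? byYear : Integer.compare(b.temperature, a.temperature);
    };

    /** Default behaviour: grouping compares the entire map output key. */
    static final Comparator<YearTemp> GROUP_BY_WHOLE_KEY = SORT;

    /** Custom grouping: only the 'year' part of the key is compared. */
    static final Comparator<YearTemp> GROUP_BY_YEAR =
            (a, b) -> Integer.compare(a.year, b.year);

    /**
     * Simulates one reduce task: sorts the keys, then opens a new group
     * (one simulated reduce() call) each time the grouping comparator
     * reports that the current key differs from the previous key.
     */
    static List<List<YearTemp>> reduceCalls(List<YearTemp> mapOutput,
                                            Comparator<YearTemp> grouping) {
        List<YearTemp> sorted = new ArrayList<>(mapOutput);
        sorted.sort(SORT);

        List<List<YearTemp>> calls = new ArrayList<>();
        YearTemp previous = null;
        for (YearTemp key : sorted) {
            if (previous == null || grouping.compare(previous, key) != 0) {
                calls.add(new ArrayList<>()); // start of a new reduce() call
            }
            calls.get(calls.size() - 1).add(key);
            previous = key;
        }
        return calls;
    }

    /** Made-up sample records, not data from the book. */
    static List<YearTemp> sample() {
        return Arrays.asList(
                new YearTemp(1900, 35),
                new YearTemp(1901, 10),
                new YearTemp(1900, 34),
                new YearTemp(1901, 29));
    }

    public static void main(String[] args) {
        List<YearTemp> data = sample();

        // Whole-key grouping: every distinct (year, temperature) gets its own call.
        System.out.println(reduceCalls(data, GROUP_BY_WHOLE_KEY).size()); // prints 4

        // Year-only grouping: one call per year; the first key is the maximum.
        for (List<YearTemp> call : reduceCalls(data, GROUP_BY_YEAR)) {
            YearTemp first = call.get(0);
            System.out.println(first.year + "\t" + first.temperature);
        }
        // prints "1900	35" and then "1901	29"
    }
}
```

With whole-key grouping, each (year, temperature) pair lands in its own reduce call, so no single call sees all temperatures of a year. With year-only grouping there is one call per year, and because the sort order puts the highest temperature first, the first record of each call is that year's maximum. In a real job the grouping comparator would be registered with `job.setGroupingComparatorClass(...)`, and a partitioner on the year would also be needed so that all keys of a year reach the same reduce task.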
