What is the use of a grouping comparator in Hadoop MapReduce?

感情败类 2020-12-01 00:09

I would like to know why a grouping comparator is used in the secondary sort of MapReduce.

According to the secondary sort example in the Definitive Guide:

We want t

4 answers
  •  南方客 (original poster)
    2020-12-01 01:03

    In support of the chosen answer, I add the following, which follows on from this explanation.

    **Input**:
    
        symbol time price
        a      1    10
        a      2    20
        b      3    30
    
    **Map output**: create composite key/value pairs like so:
    
    > symbol-time time-price
    >
    >**a-1**         1-10
    >
    >**a-2**         2-20
    >
    >**b-3**         3-30
    

    Partitioner: routes the a-1 and a-2 keys to the same reducer even though the keys differ, because it partitions on the symbol part of the key only. The b-3 key is partitioned independently and, in this example, goes to a separate reducer.
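
    The routing rule above can be sketched in plain Java. This is only a simulation under stated assumptions, not Hadoop code: the class `NaturalKeyPartitionDemo` and its helpers `naturalKey` and `partition` are hypothetical names, composite keys are modelled as `symbol-time` strings, and the hash-modulo rule merely imitates what a custom partitioner would typically do.

    ```java
    import java.util.ArrayList;
    import java.util.List;
    import java.util.Map;
    import java.util.TreeMap;

    // Plain-Java simulation of natural-key partitioning; no Hadoop classes are used.
    class NaturalKeyPartitionDemo {

        // Hypothetical helper: the natural key is the part before the '-' ("a-1" -> "a").
        static String naturalKey(String compositeKey) {
            return compositeKey.substring(0, compositeKey.indexOf('-'));
        }

        // Hash only the natural key, so "a-1" and "a-2" always land in the same partition.
        static int partition(String compositeKey, int numReducers) {
            return (naturalKey(compositeKey).hashCode() & Integer.MAX_VALUE) % numReducers;
        }

        public static void main(String[] args) {
            int numReducers = 2; // assumed reducer count for the demo
            Map<Integer, List<String>> byReducer = new TreeMap<>();
            for (String key : new String[] {"a-1", "a-2", "b-3"}) {
                byReducer.computeIfAbsent(partition(key, numReducers), r -> new ArrayList<>()).add(key);
            }
            byReducer.forEach((reducer, keys) -> System.out.println("reducer " + reducer + " <- " + keys));
        }
    }
    ```

    In an actual job, the equivalent logic would live in a `Partitioner` subclass registered with `Job.setPartitionerClass`. Whether b-3 ends up on a different reducer depends on the hash and on the number of reducers.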

    Grouping comparator: once the composite key/value pairs arrive at the reducer, the reducer would, without a grouping comparator, get

    >(**a-1**,{1-10})
    >
    >(**a-2**,{2-20})
    

    as two separate groups, because the composite keys are all distinct.

    The grouping comparator ensures that the reducer gets:

    (a-1,{**1-10,2-20**})
    

    The key of the grouped values is the one that comes first in the group. Which key comes first is controlled by the key (sort) comparator.

    **The whole group is delivered in a single reduce method call.**
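
    To make the effect of grouping concrete, here is a minimal plain-Java simulation of how sorted map output is cut into reduce calls. It is a sketch, not Hadoop code: `GroupingComparatorDemo`, `reduceCalls`, `SORT` and `GROUP` are hypothetical names, and for brevity all three example keys are fed to one simulated reducer.

    ```java
    import java.util.ArrayList;
    import java.util.Arrays;
    import java.util.Comparator;
    import java.util.List;

    // Plain-Java simulation of sort + grouping on the reduce side; no Hadoop classes are used.
    class GroupingComparatorDemo {

        static String symbol(String key) { return key.substring(0, key.indexOf('-')); }
        static int time(String key) { return Integer.parseInt(key.substring(key.indexOf('-') + 1)); }

        // Sort comparator: orders by the full composite key (symbol, then time).
        static final Comparator<String> SORT = (a, b) -> {
            int bySymbol = symbol(a).compareTo(symbol(b));
            return bySymbol != 0 ? bySymbol : Integer.compare(time(a), time(b));
        };

        // Grouping comparator: looks at the symbol only, so a-1 and a-2 compare as equal.
        static final Comparator<String> GROUP = (a, b) -> symbol(a).compareTo(symbol(b));

        // Returns one string per simulated reduce() call; each pair is {key, value}.
        static List<String> reduceCalls(List<String[]> pairs, Comparator<String> sort, Comparator<String> group) {
            List<String[]> sorted = new ArrayList<>(pairs);
            sorted.sort((p, q) -> sort.compare(p[0], q[0]));
            List<String> calls = new ArrayList<>();
            int i = 0;
            while (i < sorted.size()) {
                String firstKey = sorted.get(i)[0]; // the key that sorted first represents the group
                List<String> values = new ArrayList<>();
                while (i < sorted.size() && group.compare(firstKey, sorted.get(i)[0]) == 0) {
                    values.add(sorted.get(i)[1]);
                    i++;
                }
                calls.add("(" + firstKey + ",{" + String.join(",", values) + "})");
            }
            return calls;
        }

        public static void main(String[] args) {
            List<String[]> pairs = Arrays.asList(
                new String[] {"a-2", "2-20"},
                new String[] {"a-1", "1-10"},
                new String[] {"b-3", "3-30"});
            // Grouping by the full key: every composite key is its own reduce call.
            System.out.println("grouped by full key: " + reduceCalls(pairs, SORT, SORT));
            // Grouping by symbol: a-1 and a-2 share one reduce call, keyed by a-1.
            System.out.println("grouped by symbol:   " + reduceCalls(pairs, SORT, GROUP));
        }
    }
    ```

    In a real job, the two comparators would be registered with `Job.setSortComparatorClass` and `Job.setGroupingComparatorClass`. The simulation only illustrates why the values arrive sorted by time inside one group.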
    
