Combiner Implementation and internal working

无人共我 2020-12-22 07:45

I want to use a combiner in my MapReduce code, for example WordCount.

How should I implement it?

What sort of data is being passed to the reducer from the combiner?

3 answers
  •  没有蜡笔的小新
    2020-12-22 08:14

    A Combiner, also known as a semi-reducer, is an optional class that runs on the output of each map task before that output is sent to the reducers.

    Its main function is to summarize the map output records that share the same key.

    The Combiner sits between the Map class and the Reduce class and reduces the volume of data transferred between them.

    Here is an explanation with sample data and code.

    Map input:

    What do you mean by Object
    What do you know about Java
    What is Java Virtual Machine
    How Java enabled High Performance
    

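    Map output:

    <What,1> <do,1> <you,1> <mean,1> <by,1> <Object,1>
    <What,1> <do,1> <you,1> <know,1> <about,1> <Java,1>
    <What,1> <is,1> <Java,1> <Virtual,1> <Machine,1>
    <How,1> <Java,1> <enabled,1> <High,1> <Performance,1>

    This map output is passed as input to the Combiner.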

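    Combiner output (assuming the summing combiner configured below and a single map task that processes all four lines):

    <What,3> <do,2> <you,2> <mean,1> <by,1> <Object,1>
    <know,1> <about,1> <Java,3>
    <is,1> <Virtual,1> <Machine,1>
    <How,1> <enabled,1> <High,1> <Performance,1>

    This combiner output is passed as input to the Reducer. The combiner's output key and value types must match the map output types, so the Reducer still receives <word, count> pairs. With several map tasks, each task emits its own partial sums and the Reducer adds them up.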

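    Reducer output:

    <What,3> <do,2> <you,2> <mean,1> <by,1> <Object,1>
    <know,1> <about,1> <Java,3>
    <is,1> <Virtual,1> <Machine,1>
    <How,1> <enabled,1> <High,1> <Performance,1>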
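
The data flow above can be reproduced without Hadoop. What follows is a minimal sketch in plain Java, not the Hadoop API: the class `CombinerSketch` and its methods are hypothetical names, and splitting the four input lines across two map tasks is an assumption made only to show partial sums being merged.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hadoop-free simulation of map -> combine -> reduce for WordCount.
// All names are hypothetical and only mirror the roles in a real job.
class CombinerSketch {

    // Map step: emit one (word, 1) pair per whitespace-separated token.
    static List<Map.Entry<String, Integer>> map(List<String> lines) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        for (String line : lines) {
            for (String word : line.trim().split("\\s+")) {
                if (!word.isEmpty()) {
                    out.add(Map.entry(word, 1));
                }
            }
        }
        return out;
    }

    // Combine and reduce share the same logic for a sum: add the values per key.
    static Map<String, Integer> sumByKey(List<Map.Entry<String, Integer>> pairs) {
        Map<String, Integer> sums = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            sums.merge(pair.getKey(), pair.getValue(), Integer::sum);
        }
        return sums;
    }

    public static void main(String[] args) {
        // Assumption: two map tasks, each handling two of the four input lines.
        List<String> split1 = List.of("What do you mean by Object",
                                      "What do you know about Java");
        List<String> split2 = List.of("What is Java Virtual Machine",
                                      "How Java enabled High Performance");

        List<Map.Entry<String, Integer>> mapOut1 = map(split1);
        List<Map.Entry<String, Integer>> mapOut2 = map(split2);

        // The combiner runs separately on each map task's output.
        Map<String, Integer> combined1 = sumByKey(mapOut1);
        Map<String, Integer> combined2 = sumByKey(mapOut2);

        // The reducer sees the combiner output of every map task.
        List<Map.Entry<String, Integer>> reducerInput = new ArrayList<>(combined1.entrySet());
        reducerInput.addAll(combined2.entrySet());
        Map<String, Integer> finalCounts = sumByKey(reducerInput);

        System.out.println("map output records:    " + (mapOut1.size() + mapOut2.size()));
        System.out.println("reducer input records: " + reducerInput.size());
        System.out.println("final counts:          " + finalCounts);
    }
}
```

With this assumed split, the two map tasks emit 22 records in total and the reducer receives 18 after combining. The saving grows with the number of repeated words per map task.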
    

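    If you are using Java, the code below sets the Combiner and the Reducer to the same class. This works for WordCount because summing is commutative and associative, so partial sums can safely be summed again.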

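      // Assumes the TokenizerMapper and IntSumReducer classes from the Hadoop WordCount example.
      Job job = Job.getInstance(new Configuration(), "word count");
      job.setJarByClass(WordCount.class);
      job.setMapperClass(TokenizerMapper.class);
      job.setCombinerClass(IntSumReducer.class);  // runs on each map task's output
      job.setReducerClass(IntSumReducer.class);   // produces the final counts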
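
Reusing the reducer as the combiner is only safe when applying the operation to partial results gives the same answer as applying it to all values at once. Hadoop may run the combiner zero, one, or several times per map task, so the job must not depend on it. The sketch below uses plain Java with made-up numbers and a hypothetical class name, `CombinerSafetySketch`, to contrast a sum (safe) with a mean (not safe).

```java
import java.util.List;

// Illustrative only: compares combining partial results for sum and for mean.
class CombinerSafetySketch {

    static int sum(List<Integer> values) {
        int total = 0;
        for (int v : values) {
            total += v;
        }
        return total;
    }

    static double mean(List<Integer> values) {
        return (double) sum(values) / values.size();
    }

    public static void main(String[] args) {
        // Assumption: the values for one key are spread over two map tasks.
        List<Integer> task1 = List.of(10, 20, 30);
        List<Integer> task2 = List.of(50);
        List<Integer> all = List.of(10, 20, 30, 50);

        // Sum of partial sums equals the sum over all values.
        int sumOfPartials = sum(List.of(sum(task1), sum(task2)));
        System.out.println(sumOfPartials == sum(all)); // prints "true"

        // Mean of partial means differs from the mean over all values.
        double meanOfMeans = (mean(task1) + mean(task2)) / 2;
        System.out.println(meanOfMeans + " vs " + mean(all)); // prints "35.0 vs 27.5"
    }
}
```

For an operation like the mean, one common workaround is to have the combiner emit a (sum, count) pair and let the reducer do the final division.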
    

    Have a look at the working Java example at tutorialspoint.
