Sorted word count using Hadoop MapReduce

鱼传尺愫 2020-12-16 02:37

I'm very new to MapReduce and have completed the Hadoop word-count example.

That example produces an unsorted file of word counts (as key-value pairs). So is i

4 answers

无人及你 (original poster)

2020-12-16 03:12

In Hadoop, sorting by key is done between the Map and Reduce phases. One approach to sorting by word occurrence is to run a second job over the word-count output that emits the count as the key, and to use a custom grouping comparator that doesn't group anything, so every call to reduce receives just one key and one value.

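// Each public class belongs in its own .java file; imports are listed once for brevity.
import java.io.IOException;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.WritableComparable;
import org.apache.hadoop.io.WritableComparator;
import org.apache.hadoop.mapred.*;
import org.apache.hadoop.mapred.lib.IdentityReducer;

public class Program {
   public static void main( String[] args) throws IOException {
      JobConf conf = new JobConf( Program.class);

      // Assumption: args[0] is the word-count output stored as a SequenceFile
      // of (Text, IntWritable) pairs, and args[1] is the output directory.
      conf.setInputFormat( SequenceFileInputFormat.class);
      FileInputFormat.setInputPaths( conf, new Path( args[0]));
      FileOutputFormat.setOutputPath( conf, new Path( args[1]));

      conf.setOutputKeyClass( IntWritable.class);
      conf.setOutputValueClass( Text.class);
      conf.setMapperClass( Map.class);
      conf.setReducerClass( IdentityReducer.class);
      conf.setOutputValueGroupingComparator( GroupComparator.class);
      conf.setNumReduceTasks( 1);   // a single reducer yields one globally sorted file
      JobClient.runJob( conf);
   }
}

// Swaps (word, count) to (count, word) so that the shuffle sorts by count.
public class Map extends MapReduceBase implements Mapper<Text, IntWritable, IntWritable, Text> {

   public void map( Text key, IntWritable value, OutputCollector<IntWritable, Text> output, Reporter reporter) throws IOException {
       output.collect( value, key);
   }
}

// Never reports two keys as equal, so no two records share a reduce call.
public class GroupComparator extends WritableComparator {
    protected GroupComparator() {
        super( IntWritable.class, true);
    }

    @Override
    public int compare( WritableComparable w1, WritableComparable w2) {
        return -1;
    }
}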
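
To check what output to expect from the job above without a cluster, here is a minimal plain-Java sketch of the same idea: swap each (word, count) pair to (count, word), then sort by the new key. It only imitates the map and shuffle steps in memory and is not Hadoop code. The class name `SortedWordCountSketch`, the method `sortByCount`, and the sample words are hypothetical and chosen purely for illustration.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class SortedWordCountSketch {

    // Imitates the mapper (swap key and value) followed by the shuffle (sort by key).
    public static List<Map.Entry<Integer, String>> sortByCount(Map<String, Integer> wordCounts) {
        List<Map.Entry<Integer, String>> swapped = new ArrayList<>();
        for (Map.Entry<String, Integer> e : wordCounts.entrySet()) {
            // Same swap as output.collect(value, key) in the mapper.
            swapped.add(new AbstractMap.SimpleEntry<>(e.getValue(), e.getKey()));
        }
        // Ascending by count, matching the default IntWritable ordering.
        swapped.sort(Map.Entry.comparingByKey());
        return swapped;
    }

    public static void main(String[] args) {
        // Hypothetical word-count output.
        Map<String, Integer> counts = new LinkedHashMap<>();
        counts.put("hadoop", 3);
        counts.put("map", 1);
        counts.put("reduce", 2);

        // One line per pair, count first, as the single reducer would write them.
        for (Map.Entry<Integer, String> e : sortByCount(counts)) {
            System.out.println(e.getKey() + "\t" + e.getValue());
        }
    }
}
```

With the sample input, the pairs come out in ascending order of count: map (1), reduce (2), hadoop (3). Words with equal counts keep their input order in this sketch because `List.sort` is stable; the Hadoop job makes no such promise for equal keys.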
