Finding biggest value for key

长发绾君心 2021-01-07 04:23

I want to find the country with the largest area.

My data set is as follows:

Afghanistan 648
Albania 29
Algeria 2388
Andorra 0
Austria 84
Bah         


        
1 answer
  •  感情败类
    2021-01-07 04:53

    The algorithm is simple: each mapper keeps track of the largest value it has seen, and when the mapper finishes you write that single record out from cleanup().

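    // Largest value seen by this mapper so far, and the name it belongs to.
    private int max = Integer.MIN_VALUE;
    private String token;
    
    @Override
    public void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
        // Assumes comma-separated input: name in column 0, a filter flag in column 2,
        // the value to maximise in column 3. Adjust the delimiter and indices to your data.
        String[] tokens = value.toString().split(",");
        if (Integer.parseInt(tokens[2]) == 1) {
            int val = Integer.parseInt(tokens[3]);
            if (val > max) {
                max = val;
                token = tokens[0];
            }
        }
    }
    
    @Override
    public void cleanup(Context context) throws IOException, InterruptedException {
        // Emit one (max, name) pair per mapper, but only if some record matched.
        if (token != null) {
            context.write(new LongWritable(max), new Text(token));
        }
    }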
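
To see the per-mapper step in isolation, here is a minimal plain-Java sketch that runs without Hadoop. It assumes whitespace-separated lines of the form `name area`, as in the sample from the question, and that a name is a single token; the class and method names (`LocalMaxSketch`, `localMax`) are made up for illustration.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.List;
import java.util.Map;

class LocalMaxSketch {
    /** Returns the (name, area) pair with the largest area, or null if no line parses. */
    static Map.Entry<String, Integer> localMax(List<String> lines) {
        String best = null;
        int max = Integer.MIN_VALUE;
        for (String line : lines) {
            // Assumption: "name area" separated by whitespace, name is one token.
            String[] tokens = line.trim().split("\\s+");
            if (tokens.length < 2) {
                continue; // skip blank or truncated lines
            }
            int area;
            try {
                area = Integer.parseInt(tokens[1]);
            } catch (NumberFormatException e) {
                continue; // skip lines whose second column is not a number
            }
            if (area > max) {
                max = area;
                best = tokens[0];
            }
        }
        return best == null ? null : new SimpleEntry<>(best, max);
    }

    public static void main(String[] args) {
        List<String> split = List.of(
                "Afghanistan 648", "Albania 29", "Algeria 2388", "Andorra 0", "Austria 84");
        System.out.println(localMax(split)); // prints Algeria=2388
    }
}
```

Returning null for an empty or unparseable split plays the same role as skipping the write in cleanup() when nothing matched.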
    

    Every mapper's output is now keyed on its maximum, so if the keys are sorted in descending order, the overall maximum arrives as the first record in the reducer. This relies on the job running with a single reducer. To get the descending order, set this in your job:

    job.setSortComparatorClass(LongWritable.DecreasingComparator.class);
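
As a rough illustration of what the descending sort buys you, the following plain-Java sketch groups a few mapper outputs by value with the largest key first. It only imitates the ordering; it is not how Hadoop implements the shuffle. The class name `DescendingShuffleSketch` is hypothetical, and the three entries are taken from the question's sample and treated as if they were the local maxima of three mappers.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class DescendingShuffleSketch {
    /** Groups names by value, largest value first. */
    static TreeMap<Long, List<String>> group(Map<String, Long> mapperOutputs) {
        // reverseOrder() stands in for LongWritable.DecreasingComparator here.
        TreeMap<Long, List<String>> grouped = new TreeMap<>(Comparator.reverseOrder());
        for (Map.Entry<String, Long> e : mapperOutputs.entrySet()) {
            grouped.computeIfAbsent(e.getValue(), k -> new ArrayList<>()).add(e.getKey());
        }
        return grouped;
    }

    public static void main(String[] args) {
        // Assumed local maxima from three mappers (values from the question's sample).
        Map<String, Long> outputs = Map.of("Afghanistan", 648L, "Algeria", 2388L, "Austria", 84L);
        TreeMap<Long, List<String>> grouped = group(outputs);
        System.out.println(grouped.firstKey()); // prints 2388
    }
}
```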
    

    The reducer is simply a found/not-found switch: it outputs every country that has the maximum value (the first key it receives) and ignores every key after that.

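    // Set once the first (largest) key has been handled.
    private boolean foundMax = false;
    
    @Override
    public void reduce(LongWritable key, Iterable<Text> values, Context context) throws IOException, InterruptedException {
        if (!foundMax) {
            // Keys arrive in descending order, so the first key is the maximum.
            for (Text t : values) {
                context.write(t, key);
            }
            foundMax = true;
        }
    }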
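
The switch also handles ties: every name that shares the largest key is written, because they all arrive in the same reduce call. A small plain-Java sketch of that behaviour, with a hypothetical class `FirstGroupReducerSketch` and made-up placeholder names "A", "B" and "C":

```java
import java.util.ArrayList;
import java.util.List;

class FirstGroupReducerSketch {
    private boolean foundMax = false;
    private final List<String> output = new ArrayList<>();

    /** Mirrors reduce(): expected to be called once per key, in descending key order. */
    void reduce(long key, List<String> values) {
        if (foundMax) {
            return; // a larger key was already handled, ignore the rest
        }
        for (String name : values) {
            output.add(name + "\t" + key);
        }
        foundMax = true;
    }

    List<String> output() {
        return output;
    }

    public static void main(String[] args) {
        FirstGroupReducerSketch r = new FirstGroupReducerSketch();
        r.reduce(10, List.of("A", "B")); // first call: largest key, two tied names
        r.reduce(7, List.of("C"));       // later keys are skipped
        System.out.println(r.output());
    }
}
```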
    
