Question
What is the best approach to deliver the map output keys to a reducer in reverse order? By default, the reducer receives keys in ascending order. Any help or comments would be greatly appreciated.
Put simply: normally, if a mapper emits the keys 1, 4, 3, 5, 2, the reducer receives them as 1, 2, 3, 4, 5. I would like the reducer to receive 5, 4, 3, 2, 1 instead.
Answer 1:
In Hadoop 1.X, you can specify a custom comparator class for your outputs using JobConf.setOutputKeyComparatorClass.
Your comparator must implement the RawComparator interface.
With Hadoop 2.X, this is done by using Job.setSortComparatorClass, still with an implementation of RawComparator.
Answer 2:
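Here is a simple example:
import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.WritableComparable;
import org.apache.hadoop.io.WritableComparator;

// Sorts Text keys in descending order by inverting the natural comparison.
class MyKeyComparator extends WritableComparator {
    protected MyKeyComparator() {
        // true: let WritableComparator create key instances for deserialization
        super(Text.class, true);
    }

    @SuppressWarnings("rawtypes")
    @Override
    public int compare(WritableComparable w1, WritableComparable w2) {
        Text key1 = (Text) w1;
        Text key2 = (Text) w2;
        // Swapping the operands reverses the sort order.
        return key2.compareTo(key1);
    }
}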
Then register it with the job:
job.setSortComparatorClass(MyKeyComparator.class);
You can replace Text with whatever key type your job uses. The casts shown below, and the class passed to super(...), must change accordingly.
Text key1 = (Text) w1;
Text key2 = (Text) w2;
Answer 3:
If your keys are numeric, you can multiply each key by -1 before emitting it from your mapper. The framework still sorts in ascending order, but on the negated values: -5, -4, -3, -2, -1. In the reducer, multiply each key by -1 again to recover 5, 4, 3, 2, 1. The net effect is a pseudo-descending sort. For a more complex sort, it is better to write a custom comparator class and set it in your driver class.
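To make the negation trick concrete, here is a minimal sketch in plain Java that does not use Hadoop at all. Arrays.sort stands in for the framework's ascending sort, and the class and method names (NegatedKeySort, negate, reducerView) are hypothetical, chosen only for this illustration.

```java
import java.util.Arrays;

// Hypothetical stand-alone simulation of the negate-sort-negate trick.
final class NegatedKeySort {

    // What the mapper (and later the reducer) would do: flip the sign of each key.
    static int[] negate(int[] keys) {
        int[] out = new int[keys.length];
        for (int i = 0; i < keys.length; i++) {
            out[i] = -keys[i];
        }
        return out;
    }

    // Returns the keys in the order the reducer would see them after
    // undoing the negation.
    static int[] reducerView(int[] mapKeys) {
        int[] emitted = negate(mapKeys); // mapper emits -key
        Arrays.sort(emitted);            // stand-in for the ascending shuffle sort
        return negate(emitted);          // reducer restores the original keys
    }

    public static void main(String[] args) {
        int[] result = reducerView(new int[] {1, 4, 3, 5, 2});
        System.out.println(Arrays.toString(result)); // prints [5, 4, 3, 2, 1]
    }
}
```

One caveat with this approach: negating Integer.MIN_VALUE overflows and yields Integer.MIN_VALUE again, so a key with that value would sort in the wrong position.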
Source: https://stackoverflow.com/questions/11670953/reverse-sorting-reducer-keys