Question
What is the best approach to get the map output keys to a reducer in reverse order? By default, the reducer receives all keys in ascending order. Any help or comments would be greatly appreciated.
In simple words, in the normal scenario, if a map emits keys 1,4,3,5,2 the reducer receives the same as 1,2,3,4,5. I would like the reducer to receive 5,4,3,2,1 instead.
Answer 1:
In Hadoop 1.X, you can specify a custom comparator class for your outputs using JobConf.setOutputKeyComparatorClass.
Your comparator must implement the RawComparator interface.
With Hadoop 2.X, this is done by using Job.setSortComparatorClass, still with an implementation of RawComparator.
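To illustrate what such a comparator does, here is a minimal sketch in plain Java with no Hadoop dependency. It assumes the keys are serialized as 4-byte big-endian integers and mirrors the six-argument shape of RawComparator.compare; the class and method names (ReverseRawCompareSketch, compareDescending, sortDescending) are hypothetical and only stand in for the framework's sort step.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

public class ReverseRawCompareSketch {

    // Same parameter shape as RawComparator.compare: two byte buffers,
    // each with a start offset and a length. Assumes 4-byte big-endian int keys.
    public static int compareDescending(byte[] b1, int s1, int l1,
                                        byte[] b2, int s2, int l2) {
        int k1 = ByteBuffer.wrap(b1, s1, l1).getInt();
        int k2 = ByteBuffer.wrap(b2, s2, l2).getInt();
        // Operands are swapped, so larger keys sort first.
        return Integer.compare(k2, k1);
    }

    // Hypothetical helper: serialize an int key as 4 big-endian bytes.
    public static byte[] serialize(int key) {
        return ByteBuffer.allocate(4).putInt(key).array();
    }

    // Stand-in for the sort phase: sorts serialized keys with the
    // descending comparator and decodes them again.
    public static List<Integer> sortDescending(int... keys) {
        List<byte[]> raw = new ArrayList<>();
        for (int k : keys) {
            raw.add(serialize(k));
        }
        raw.sort((a, b) -> compareDescending(a, 0, a.length, b, 0, b.length));
        List<Integer> out = new ArrayList<>();
        for (byte[] r : raw) {
            out.add(ByteBuffer.wrap(r).getInt());
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(sortDescending(1, 4, 3, 5, 2)); // prints [5, 4, 3, 2, 1]
    }
}
```

In a real job the framework calls the comparator for you; only the comparison logic would move into the RawComparator implementation.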
Answer 2:
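A simple example:
import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.WritableComparable;
import org.apache.hadoop.io.WritableComparator;

class MyKeyComparator extends WritableComparator {
    // The constructor name must match the class name.
    // 'true' asks WritableComparator to create key instances for deserialization.
    protected MyKeyComparator() {
        super(Text.class, true);
    }
    @SuppressWarnings("rawtypes")
    @Override
    public int compare(WritableComparable w1, WritableComparable w2) {
        Text key1 = (Text) w1;
        Text key2 = (Text) w2;
        // Swap the operands to reverse the natural (ascending) order.
        return key2.compareTo(key1);
    }
}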
Then add it to the job:
job.setSortComparatorClass(MyKeyComparator.class);
You can change the Text type in the casts below (and in the super(...) call) to match the key type your job uses.
Text key1 = (Text) w1;
Text key2 = (Text) w2;
Answer 3:
For numeric keys, you can multiply the key by -1 before emitting it from your mapper. The framework still sorts in ascending order, but on the negated values (-5, -4, -3, -2, -1). Multiplying by -1 again in the reducer gives 5, 4, 3, 2, 1, so the reducer effectively sees the keys in pseudo-descending order. For a more complex sort, it is better to write a custom comparator class and set it in your driver class.
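The negation trick can be tried out without a cluster. The following is a minimal sketch in plain Java that uses a TreeSet as a stand-in for the framework's ascending sort; the class and method names (NegatedKeySketch, reducerOrder) are hypothetical. It uses Math.negateExact because plain negation silently overflows for Integer.MIN_VALUE, which is one case where this trick breaks down.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

public class NegatedKeySketch {

    // Returns the order in which a reducer would see the keys when the
    // mapper emits -key and the reducer negates each key back.
    public static List<Integer> reducerOrder(int... mapKeys) {
        // Stand-in for the shuffle/sort phase: ascending order, one entry
        // per distinct key (duplicate keys are grouped in MapReduce too).
        TreeSet<Integer> sorted = new TreeSet<>();
        for (int key : mapKeys) {
            // "Mapper" side: emit the negated key.
            // negateExact throws ArithmeticException for Integer.MIN_VALUE.
            sorted.add(Math.negateExact(key));
        }
        List<Integer> seen = new ArrayList<>();
        for (int negated : sorted) {
            // "Reducer" side: restore the original key.
            seen.add(-negated);
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(reducerOrder(1, 4, 3, 5, 2)); // prints [5, 4, 3, 2, 1]
    }
}
```

This only works for key types that can be negated, which is why a custom comparator is the more general option.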
Source: https://stackoverflow.com/questions/11670953/reverse-sorting-reducer-keys