I understand how to sort the values of a particular key before the key enters the reducer. I learned that it can be done by writing three methods, viz. keycomparator,
This may be surprising, but each iteration of the values Iterable actually updates the contents of the key object too:
    protected void reduce(K key, Iterable<V> values, Context context) {
        for (V value : values) {
            // the contents of the key object are updated on each iteration of this loop
        }
    }
I know this works for the new mapreduce API; I haven't traced it for the old mapred API.
So, in answer to your question: all the keys will be available, and the key you see on the first iteration is the first sorted key of the group.
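To make "first sorted key of the group" concrete, here is a minimal plain-Java sketch with no Hadoop dependency. The CompositeKey class, its name and seq fields, and the SORT and GROUP comparators are all hypothetical stand-ins for whatever composite key and comparators your job defines; the sketch only imitates the sort-then-group behaviour and is not how Hadoop implements it.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class SecondarySortDemo {
    // Hypothetical composite key: 'name' is the natural key, 'seq' the secondary field.
    static final class CompositeKey {
        final String name;
        final int seq;

        CompositeKey(String name, int seq) {
            this.name = name;
            this.seq = seq;
        }

        @Override
        public String toString() {
            return name + "#" + seq;
        }
    }

    // Sort comparator: orders by name, then by seq (this is the secondary sort).
    static final Comparator<CompositeKey> SORT =
            Comparator.comparing((CompositeKey k) -> k.name).thenComparingInt(k -> k.seq);

    // Grouping comparator: looks at name only, so keys that differ in seq share a group.
    static final Comparator<CompositeKey> GROUP = Comparator.comparing(k -> k.name);

    /** Sorts the keys, then splits them into runs that GROUP treats as equal. */
    static List<List<CompositeKey>> groups(List<CompositeKey> keys) {
        List<CompositeKey> sorted = new ArrayList<>(keys);
        sorted.sort(SORT);
        List<List<CompositeKey>> result = new ArrayList<>();
        CompositeKey previous = null;
        for (CompositeKey k : sorted) {
            if (previous == null || GROUP.compare(previous, k) != 0) {
                result.add(new ArrayList<>()); // grouping comparator says: new group
            }
            result.get(result.size() - 1).add(k);
            previous = k;
        }
        return result;
    }
}
```

Each inner list corresponds to one reduce call: every distinct composite key in the group is still there, and the first element is the smallest key under the sort comparator.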
EDIT: Some additional information as to how and why this works:
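There are two comparators that the reducer uses to process the key/value pairs output by the map stage: the key (sort) comparator, which determines the order in which the pairs are presented, and the key grouping comparator, which decides whether consecutive keys belong to the same reduce call.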
Under the hood, the references to the key and value objects never change. Each call to Iterable.iterator().next() advances the pointer in the underlying byte stream to the next KV pair. If the key grouper determines that the current key's bytes and the previous key's bytes compare as the same key, then the hasNext() method of the values Iterable.iterator() returns true; otherwise it returns false. If true is returned, the bytes are deserialized into the existing Key and Value instances for consumption in your reduce method.
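The mechanism described above can be imitated in plain Java. This is only a rough sketch under stated assumptions, not Hadoop's actual code: the Key class, the "name,seq,value" record format, and the sameGroupAhead helper are all made up for illustration, the grouping rule is simply "same name", and the value is a plain String rather than a reused Writable. What it does show is a single key instance being overwritten from a byte stream while a hasNext-style peek decides where each group ends.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class ReusedKeyDemo {
    // Hypothetical composite key: grouped by 'name', sorted within a group by 'seq'.
    static final class Key {
        String name;
        int seq;

        // Overwrites this instance's fields from the stream, Writable-style.
        void readFields(DataInputStream in) throws IOException {
            name = in.readUTF();
            seq = in.readInt();
        }
    }

    /** Serializes already-sorted "name,seq,value" records into one byte stream. */
    static byte[] serialize(String... records) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            for (String record : records) {
                String[] parts = record.split(",");
                out.writeUTF(parts[0]);
                out.writeInt(Integer.parseInt(parts[1]));
                out.writeUTF(parts[2]);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Stand-in for hasNext(): peeks at the next key's bytes and applies the grouping rule. */
    static boolean sameGroupAhead(DataInputStream in, String groupName) throws IOException {
        if (in.available() == 0) {
            return false; // end of stream, so no more values for this group
        }
        in.mark(Integer.MAX_VALUE);
        String nextName = in.readUTF();
        in.reset(); // rewind so the next read starts at the same KV pair
        return nextName.equals(groupName);
    }

    /** Replays the stream group by group and records what each loop iteration observes. */
    static List<String> run(byte[] data) {
        List<String> seen = new ArrayList<>();
        Key key = new Key(); // the only Key instance; its reference never changes
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            int group = 0;
            while (in.available() > 0) {
                group++; // corresponds to one reduce() call
                String groupName = null;
                do {
                    key.readFields(in); // stand-in for next(): same object, new contents
                    String value = in.readUTF();
                    if (groupName == null) {
                        groupName = key.name;
                    }
                    seen.add("group " + group + ": " + key.name + "#" + key.seq + "=" + value);
                } while (sameGroupAhead(in, groupName));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return seen;
    }

    public static void main(String[] args) {
        run(serialize("a,1,x", "a,2,y", "b,1,z")).forEach(System.out::println);
    }
}
```

Within group 1 the same key object reports a#1 and then a#2, which is the behaviour the reduce loop relies on. It also means that if you need to keep a key beyond the current iteration, you would have to copy it rather than hold on to the reference.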