I am writing code for the external merge sort. The idea is that the input files contain too many numbers to be stored in an array so you read some of it and put it into files to
You might wish to merge k>2 segments at a time. This reduces the amount of I/O from n log k / log 2 to n log n / log k.
Edit: In pseudocode, this would look something like this:
void sort(List list) {
if (list fits in memory) {
list.sort();
} else {
sublists = partition list into k about equally big sublists
for (sublist : sublists) {
sort(sublist);
}
merge(sublists);
}
}
void merge(List[] sortedsublists) {
keep a pointer in each sublist, which initially points to its first element
do {
find the pointer pointing at the smallest element
add the element it points to to the result list
advance that pointer
} until all pointers have reached the end of their sublist
return the result list
}
To efficiently find the "smallest" pointer, you might employ a PriorityQueue
.