问题
I'm trying to create an unusual associative array implementation that is very space-efficient, and I need a sorting algorithm that meets all of the following:
- Stable (Does not change the relative ordering of elements with equal keys.)
- In-place or almost in-place (O(log n) stack is fine, but no O(n) space usage or heap allocations.
- O(n log n) time complexity.
Also note that the data structure to be sorted is an array.
It's easy to see that there's a basic algorithm that matches any 2 of these three (insertion sort matches 1 and 2, merge sort matches 1 and 3, heap sort matches 2 and 3), but I cannot for the life of me find anything that matches all three of these criteria.
回答1:
Merge sort can be written to be in-place I believe. That may be the best route.
回答2:
Note: standard quicksort is not O(n log n) ! In the worst case, it can take up to O(n^2) time. The problem is that you might pivot on an element which is far from the median, so that your recursive calls are highly unbalanced.
There is a way to combat this, which is to carefully pick a median which is guaranteed, or at least very likely, to be close to the median. It is surprising that you can actually find the exact median in linear time, although in your case it sounds like you care about speed so I would not suggest this.
I think the most practical approach is to implement a stable quicksort (it's easy to keep stable) but use the median of 5 random values as the pivot at each step. This makes it highly unlikely that you'll have a slow sort, and is stable.
By the way, merge sort can be done in-place, although it's tricky to do both in-place and stable.
回答3:
What about quicksort?
Exchange can do that too, might be more "stable" by your terms, but quicksort is faster.
回答4:
There's a list of sort algorithms on Wikipedia. It includes categorization by execution time, stability, and allocation.
Your best bet is probably going to be modifying an efficient unstable sort to be stable, thereby making it less efficient.
回答5:
There is a class of stable in-place merge algorithms, although they are complicated and linear with a rather high constant hidden in the O(n). To learn more, have a look at this article, and its bibliography.
Edit: the merge phase is linear, thus the mergesort is nlog_n.
回答6:
Because your elements are in an array (rather than, say, a linked list) you have some information about their original order available to you in the array indices themselves. You can take advantage of this by writing your sort and comparison functions to be aware of the indices:
function cmp( ar, idx1, idx2 )
{
// first compare elements as usual
rc = (ar[idx1]<ar[idx2]) ? -1 : ( (ar[idx1]>ar[idx2]) ? 1 : 0 );
// if the elements are identical, then compare their positions
if( rc != 0 )
rc = (idx1<idx2) ? -1 : ((idx1>idx2) ? 1 : 0);
return rc;
}
This technique can be used to make any sort stable, as long as the sort ONLY performs element swaps. The indices of elements will change, but the relative order of identical elements will stay the same, so the sort remains robust. It won't work out of the box for a sort like heapsort because the original heapification "throws away" the relative ordering, though you might be able to adapt the idea to other sorts.
回答7:
Quicksort can be made stable reasonably easy simply by having an sequence field added to each record, initializing it to the index before sorting and using it as the least significant part of the sort key.
This has a slightly adverse effect on the time taken but it doesn't affect the time complexity of the algorithm. It also has a minimal storage cost overhead for each record, but that rarely matters until you get very large numbers of records (and is mimimized with larger record sizes).
I've used this method with C
's qsort()
function to avoid writing my own. Each record has a 32-bit integer added and populated with the starting sequence number before calling qsort()
.
Then the comparison function checked the keys and the sequence (this guarantees there are no duplicate keys), turning the quicksort into a stable one. I recall that it still outperformed the inherently stable mergesort for the data sets I was using.
Your mileage may vary, so always remember: Measure, don't guess!
回答8:
Quicksort can be made stable by doing it on a linked list. This costs n to pick random or median of 3 pivots but with a very small constant (list traversal).
By splitting the list and ensuring that the left list is sorted so same values go left and the right list is sorted so the same values go right, the sort will be implicity stable for no real extra cost. Also, since this deals with assignment rather than swapping, I think the speed might actually be slightly better than a quick sort on an array since there's only a single write.
So in conclusion, list up all your items and run quicksort on a list
回答9:
Perhaps shell sort? If I recall my data structures course correctly, it tended to be stable, but it's worse case time is O(n log^2 n), although it performs O(n) on almost sorted data. It's based on insertion sort, so it sorts in place.
回答10:
Don't worry too much about O(n log n) until you can demonstrate that it matters. If you can find an O(n^2) algorithm with a drastically lower constant, go for it!
The general worst-case scenario is not relevant if your data is highly constrained.
In short: Run some test.
回答11:
There's a nice list of sorting functions on wikipedia that can help you find whatever type of sorting function you're after.
For example, to address your specific question, it looks like an in-place merge sort is what you want.
However, you might also want to take a look at strand sort, it's got some very interesting properties.
回答12:
I have implemented a stable in-place quicksort and a stable in-place merge sort. The merge sort is a bit faster, and guaranteed to work in O(n*log(n)^2), but not the quicksort. Both use O(log(n)) space.
回答13:
Maybe I'm in a bit of a rut, but I like hand-coded merge sort. It's simple, stable, and well-behaved. The additional temporary storage it needs is only N*sizeof(int)
, which isn't too bad.
来源:https://stackoverflow.com/questions/113025/stable-efficient-sort