For 3-way Quicksort (dual-pivot quicksort), how would I go about finding the Big-O bound? Could anyone show me how to derive it?
There's a subtle difference between finding the complexity of an algorithm and proving it.
To find the complexity of this algorithm, you can do as amit said in the other answer: you know that, on average, you split your problem of size n into three smaller problems of size n/3, so on average you reach problems of size 1 in log_3(n) steps. With experience, you will start getting a feel for this approach and be able to deduce the complexity of algorithms just by thinking about them in terms of the subproblems involved.
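To make the three-way split concrete, here is a minimal Python sketch (my addition, not from the original post) of a dual-pivot quicksort: two pivots p <= q split the input into three subproblems.

```python
def quicksort_3way(a):
    """Dual-pivot quicksort sketch: each call splits the input into
    three subproblems (below p, between p and q, above q)."""
    if len(a) <= 1:
        return list(a)
    p, q = sorted((a[0], a[-1]))   # two pivots, p <= q
    rest = a[1:-1]                 # pivots removed, so each recursion shrinks
    low  = [x for x in rest if x < p]
    mid  = [x for x in rest if p <= x <= q]
    high = [x for x in rest if x > q]
    return (quicksort_3way(low) + [p] +
            quicksort_3way(mid) + [q] +
            quicksort_3way(high))
```

On random input, the three sublists have comparable expected sizes, which is where the n/3 heuristic above comes from.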
To prove that this algorithm runs in O(n log n) in the average case, you use the Master Theorem. To use it, you have to write the recurrence giving the time spent sorting your array. As we said, sorting an array of size n can be decomposed into sorting three arrays of size n/3, plus the time spent building them. This can be written as follows:
T(n) = 3T(n/3) + f(n)
where T(n) is a function giving the resolution "time" for an input of size n (actually the number of elementary operations needed), and f(n) gives the "time" needed to split the problem into subproblems.
For 3-way quicksort, f(n) = c*n, because you go through the array, check where to place each item, and possibly make a swap. This places us in Case 2 of the Master Theorem, which states that if f(n) = Θ(n^(log_b(a)) * log^k(n)) for some k >= 0 (in our case k = 0), then

T(n) = Θ(n^(log_b(a)) * log^(k+1)(n))
Since a = 3 and b = 3 (we read these off the recurrence relation, T(n) = aT(n/b) + f(n)), this simplifies to

T(n) = O(n log n)
And that's a proof.
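You can also sanity-check the recurrence numerically. The sketch below (my addition, assuming f(n) = n and T(1) = 1 as unit costs) unrolls T(n) = 3T(n/3) + n for powers of three and compares it to n * log_3(n):

```python
import math

def T(n):
    # T(n) = 3*T(n/3) + n, with T(1) = 1; n is assumed a power of 3
    if n == 1:
        return 1
    return 3 * T(n // 3) + n

# For n = 3^k, unrolling gives the closed form T(n) = n * (k + 1),
# i.e. Theta(n log n); the ratio below tends to 1 as k grows.
for k in range(1, 8):
    n = 3 ** k
    print(n, T(n), T(n) / (n * math.log(n, 3)))
```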
Well, the same proof actually holds.
Each iteration splits the array into 3 sublists; on average, these sublists have size n/3 each.

Thus, the number of iterations needed is log_3(n), because you need to find the number of times you can do (((n/3) / 3) / 3) ... until you get to one. This gives you the equation

n/(3^i) = 1

which is satisfied for i = log_3(n).
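The equation n/(3^i) = 1 can be checked with a short loop (my sketch, assuming an even three-way split at every level):

```python
def levels_until_one(n):
    # Count how many times n must be divided by 3 to reach 1.
    i = 0
    while n > 1:
        n //= 3   # each level shrinks the subproblem size by a factor of 3
        i += 1
    return i

# For powers of three this is exactly log_3(n): 81 -> 27 -> 9 -> 3 -> 1.
print(levels_until_one(81))
```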
Each iteration still goes over all of the input (but in different sublists) - same as quicksort - which gives you O(n * log_3(n)).

Since log_3(n) = log(n)/log(3) = log(n) * CONSTANT, you get that the run time is O(n log n) on average.
Note that even if you take a more pessimistic approach to calculating the sizes of the sublists, by taking the minimum of the uniform distribution, you still get a first sublist of size 1/4, a second sublist of size 1/2, and a last sublist of size 1/4 (minimum and maximum of the uniform distribution). These will again decay to size 1 within log_k(n) iterations (with a different k > 2), which again yields O(n log n) overall.
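The pessimistic split can be checked the same way. The recurrence below is my assumption of that 1/4, 1/2, 1/4 split with linear partition cost; its growth still tracks n log n:

```python
import math
from functools import lru_cache

@lru_cache(maxsize=None)
def T_pess(n):
    # Pessimistic split: two quarters and a half, plus n partition work.
    if n <= 1:
        return 1
    return 2 * T_pess(n // 4) + T_pess(n // 2) + n

# The ratio T_pess(n) / (n * log2(n)) stays bounded as n grows,
# so the pessimistic recurrence is still O(n log n).
for n in (2 ** 8, 2 ** 12, 2 ** 16):
    print(n, T_pess(n) / (n * math.log2(n)))
```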
Formally, the proof will be something like:

Each iteration takes at most c_1 * n ops to run, for each n > N_1, for some constants c_1, N_1. (This follows from the definition of big O notation and the claim that each iteration is O(n) excluding recursion. Convince yourself why this is true. Note that here, "iteration" means all the work done by the algorithm at a certain "level" of the recursion, not in a single recursive invocation.)
As seen above, you have log_3(n) = log(n)/log(3) iterations in the average case (taking the optimistic version here; the same principles apply for the pessimistic one).
Now, we get that the running time T(n) of the algorithm satisfies, for each n > N_1:

T(n) <= c_1 * n * log(n)/log(3)
T(n) <= c_1 * n * log(n)

(the second line holds since log(3) > 1).
By the definition of big O notation, this means T(n) is in O(n log n), with M = c_1 and N = N_1.

QED
Source: https://stackoverflow.com/questions/13043813/prove-3-way-quicksort-big-o-bound