Improving the quicksort algorithm when many duplicate keys are present

人盡茶涼 submitted on 2019-11-28 23:46:52

>>Why does the above quicksort algorithm not work effectively when many duplicate keys are present?

It becomes inefficient because your breaking condition is: if(i >= j) break;
So, as you scan from both sides with i and j, it is quite possible that you break when i == j instead of letting i move past j.

What can go wrong when we break on i == j with many duplicate keys present?

When you break on i == j, the first while loop guarantees a[i] >= v and the second while loop guarantees a[j] <= v (assuming it stopped on a key rather than on a boundary check). Since we are considering a break with i == j, this means a[i] = a[j] = v, i.e. a[i] is equal to v, your pivot element.

In such a scenario, your outermost exch(a[i], a[r]); simply exchanges the pivot with a key of equal value.
Hence, in your next recursive call quicksort(a, i+1, r); for the right half of the array, a minimum element of that subarray sits at the rightmost end, which is exactly where your pivot strategy picks from (item v = a[r];). It is well known that choosing a pivot that is the minimum or the maximum of the subarray is bad for quicksort, so your subsequent recursive call for the right half will be a degenerate one.
That is why the author advises not to break on i == j but to catch that case just before it happens.
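
To make this concrete, here is a minimal sketch of the partition the question appears to be using, reconstructed from the fragments quoted above (v = a[r], the if(i >= j) break; test, and the final exchange with a[r]). The name partitionOriginal, the int keys, and std::swap in place of exch are assumptions for illustration, not the questioner's exact code.

```cpp
#include <cassert>
#include <utility>

// Assumed reconstruction of the partition under discussion.
// Precondition: l < r. The pivot is the rightmost key, v = a[r].
int partitionOriginal(int a[], int l, int r) {
    assert(l < r);
    int i = l - 1, j = r;
    int v = a[r];
    for (;;) {
        while (a[++i] < v) { }      // stops on a[i] >= v (at the latest at i == r)
        while (v < a[--j]) {        // stops on a[j] <= v ...
            if (j == l) break;      // ... or at the left boundary
        }
        if (i >= j) break;          // this also fires when i == j
        std::swap(a[i], a[j]);
    }
    std::swap(a[i], a[r]);          // put the pivot into its final slot
    return i;
}
```

For the input {1, 5, 9, 5} the two scans meet at index 1 on a key equal to the pivot, the function returns 1, and the right half {9, 5} is left with its smallest key in the pivot position a[r].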

>>What does the author mean by degenerate here?

Degenerate here means that the recursion tree becomes skewed, i.e. the subproblems generated are not of nearly equal size. You are dividing a problem of size N into problems of size roughly N-1 and 1 instead of something more balanced, such as two problems of size N/2.
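
The difference can be put in numbers with a rough cost model. The sketch below assumes that partitioning n keys costs n comparisons (a simplification) and compares a fully skewed split with an even one. skewedCost and balancedCost are hypothetical helper names, not part of the original program.

```cpp
#include <cassert>

// Assumed cost model: partitioning n keys costs n comparisons.

// Skewed split (sizes n-1 and 1): T(n) = n + T(n-1), with T(1) = 0.
long long skewedCost(long long n) {
    assert(n >= 0);
    long long total = 0;
    for (; n > 1; --n) total += n;  // unrolled iteratively to avoid deep recursion
    return total;
}

// Balanced split (two halves): T(n) = n + 2*T(n/2), with T(1) = 0.
long long balancedCost(long long n) {
    assert(n >= 0);
    if (n <= 1) return 0;
    return n + 2 * balancedCost(n / 2);
}
```

Under this model, n = 1024 gives 524799 for the skewed split and 10240 for the balanced one, which is the gap between quadratic and N log N growth.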

>>How can we modify the above program as described below?

We could implement it as follows:

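#include <utility>   // std::swap
using std::swap;

// Partitions A[l..r] around the pivot v = A[r] and returns the pivot's final index.
// Assumes l < r.
int partition(int A[], int l, int r){
        int i=l-1, j=r, v = A[r];
        for(;;){
                while(A[++i] < v);      // scan right; A[r] == v acts as a sentinel
                while(A[--j] > v)       // scan left
                        if(j == l)
                                break;  // do not run off the left end
                if(i>=j)
                        break;
                swap(A[i], A[j]);
        }
        // Case when both scans stopped on the same key equal to the pivot.
        // The A[i] == v check excludes a stop at the boundary j == l on a larger
        // key (e.g. {3, 1}), which would otherwise leave the array unsorted.
        if(i == j && A[i] == v){
                j = j+1;                // backtrack j 1 step (j <= r always holds here)
                swap(A[j], A[r]);
                return j;               // partition the subsequent problems around j now
        }
        swap(A[i], A[r]);
        return i;
}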
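
As a hedged usage sketch, this is how such a partition might be plugged into a recursive driver. The names partitionDup and quicksortDup are hypothetical. The partition is repeated so that the example compiles on its own, and it includes an A[i] == v guard on the backtrack step; that guard is an assumption added here so that the backtrack does not fire when the scans meet at the left boundary on a key larger than the pivot (e.g. {3, 1}).

```cpp
#include <cassert>
#include <utility>

// Partition with the i == j backtrack; the A[i] == v guard is an added assumption.
// Precondition: l < r.
int partitionDup(int A[], int l, int r) {
    assert(l < r);
    int i = l - 1, j = r, v = A[r];
    for (;;) {
        while (A[++i] < v) { }          // A[r] == v stops this scan at the latest
        while (A[--j] > v) {
            if (j == l) break;          // left boundary check
        }
        if (i >= j) break;
        std::swap(A[i], A[j]);
    }
    if (i == j && A[i] == v) {          // scans met on a key equal to the pivot
        j = j + 1;                      // backtrack one step
        std::swap(A[j], A[r]);
        return j;
    }
    std::swap(A[i], A[r]);
    return i;
}

// Recursive driver: sorts A[l..r] in place.
void quicksortDup(int A[], int l, int r) {
    if (r <= l) return;                 // zero or one key: nothing to do
    int p = partitionDup(A, l, r);
    quicksortDup(A, l, p - 1);
    quicksortDup(A, p + 1, r);
}
```

Calling quicksortDup(A, 0, n - 1) on an array with many repeated keys, such as {4, 1, 4, 4, 2, 4, 3, 4}, sorts it in place.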

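>>How does the above modification improve things when many duplicate keys are present?
It improves performance by not setting up the obvious degenerate case described above: when the scans meet on a key equal to the pivot, the right half no longer automatically receives a minimum element as its next pivot.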
