What is the worst case scenario for quicksort?

后端 未结 6 877
醉酒成梦
醉酒成梦 2020-12-03 11:28

When does the quicksort algorithm take O(n^2) time?

6条回答
  •  情书的邮戳
    2020-12-03 11:50

    In short, Quicksort for sorting an array lowest element first works like this:

    1. Choose a pivot element
    2. Presort array, such that all elements smaller than the pivot are on the left side
    3. Recursively do step 1. and 2. for the left side and the right side

    Ideally, you would want a pivot element that partitions the sequence in two equally long subsequences but this is not so easy.

    There are different schemes for choosing the pivot element. Early versions just took the leftmost element. In the worst case, the pivot element will always be the lowest element of the current range.

    Leftmost element is pivot

    In this case it can be easily thought out that the worst case is an monotonic increasing array:

    0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
    

    Rightmost element is pivot

    Similarly, when choosing the rightmost element the worst case will be a decreasing sequence.

    20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
    

    Center element is pivot

    One possible remedy for the worst-case for presorted arrays, is to use the center element (or slightly left of center if the sequence is of even length). Then, the worst case would be quite more exotic. It can be constructed by modifying the Quicksort algorithm to set the array elements corresponding to the currently selected pivot element to a monotonic increasing value. I.e. we know the first pivot is the center, so the center must be the lowest value, e.g. 0. Next it gets swapped to the leftmost, i.e. the leftmost value is now in the center and would be the next pivot element, so it must be 1. Now, we can already guess that the array would look like this:

    1 ? ? 0 ? ? ?
    

    Here is the C++ code for the modified Quicksort to generate a worst sequence:

    // g++ -std=c++11 worstCaseQuicksort.cpp && ./a.out
    
    #include     // swap
    #include 
    #include 
    #include       // iota
    
    int main( void )
    {
        std::vector v(20); /**< will hold the worst case later */
    
        /* p basically saves the indices of what was the initial position of the
         * elements of v. As they get swapped around by Quicksort p becomes a
         * permutation */
        auto p = v;
        std::iota( p.begin(), p.end(), 0 );
    
        /* in the worst case we need to work on v.size( sequences, because
         * the initial sequence is always split after the first element */
        for ( auto i = 0u; i < v.size(); ++i )
        {
            /* i can be interpreted as:
             *   - subsequence starting index
             *   - current minimum value, if we start at 0 */
            /* note thate in the last step iPivot == v.size()-1 */
            auto const iPivot = ( v.size()-1 + i )/2;
            v[ p[ iPivot ] ] = i;
            std::swap( p[ iPivot ], p[i] );
        }
    
        for ( auto x : v ) std::cout << " " << x;
    }
    

    The result:

                                        0
                                        0  1
                                     1  0  2
                                     2  0  1  3
                                  1  3  0  2  4
                                  4  2  0  1  3  5
                               1  5  3  0  2  4  6
                               4  2  6  0  1  3  5  7
                            1  5  3  7  0  2  4  6  8
                            8  2  6  4  0  1  3  5  7  9
                         1  9  3  7  5  0  2  4  6  8 10
                         6  2 10  4  8  0  1  3  5  7  9 11
                      1  7  3 11  5  9  0  2  4  6  8 10 12
                     10  2  8  4 12  6  0  1  3  5  7  9 11 13
                   1 11  3  9  5 13  7  0  2  4  6  8 10 12 14
                   8  2 12  4 10  6 14  0  1  3  5  7  9 11 13 15
                1  9  3 13  5 11  7 15  0  2  4  6  8 10 12 14 16
               16  2 10  4 14  6 12  8  0  1  3  5  7  9 11 13 15 17
             1 17  3 11  5 15  7 13  9  0  2  4  6  8 10 12 14 16 18
            10  2 18  4 12  6 16  8 14  0  1  3  5  7  9 11 13 15 17 19
          1 11  3 19  5 13  7 17  9 15  0  2  4  6  8 10 12 14 16 18 20
         16  2 12  4 20  6 14  8 18 10  0  1  3  5  7  9 11 13 15 17 19 21
       1 17  3 13  5 21  7 15  9 19 11  0  2  4  6  8 10 12 14 16 18 20 22
      12  2 18  4 14  6 22  8 16 10 20  0  1  3  5  7  9 11 13 15 17 19 21 23
    1 13  3 19  5 15  7 23  9 17 11 21  0  2  4  6  8 10 12 14 16 18 20 22 24
    

    There is order in this. The right side is just increments of two starting with zero. The left side also has an order. Let's format the left side for the 73 element long worst case sequence nicely using Ascii art:

      0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35
    ------------------------------------------------------------------------------------------------------------
      1     3     5     7     9    11    13    15    17    19    21    23    25    27    29    31    33    35   
        37          39          41          43          45          47          49          51          53      
              55                      57                      59                      61                      63
                                                  65                                              67            
                                                                          69                                    
                          71
    

    The header is the element index. In the first row numbers starting from 1 and increasing by 2 are given to every 2nd element. In the second row the same is done to every 4th element, in the 3rd row numbers are assigned to every 8th element and so on. In this case the first value to be written in the i-th row is at index 2^i-1, but for certain lengths this looks a tad different.

    The resulting structure is reminiscent to an inverted binary tree whose nodes are labeled bottom-up starting from the leaves.

    Median of leftmost, center and rightmost elements is pivot

    Another way is to use the median of the leftmost, the center and the rightmost element. In this case the worst case can only be, that the w.l.o.g. left subsequence is of length 2 (not just length 1 like in the examples above). Also we assume that the rightmost value will always be the highest of the median-of-three. This also means it is the highest of all values. Making adjustments in the program above, we now have this:

    auto p = v;
    std::iota( p.begin(), p.end(), 0 );
    auto i = 0u;
    for ( ; i < v.size(); i+=2 )
    {
        auto const iPivot0 = i;
        auto const iPivot1 = ( i + v.size()-1 )/2;
        v[ p[ iPivot1 ] ] = i+1;
        v[ p[ iPivot0 ] ] = i;
        std::swap( p[ iPivot1 ], p[i+1] );
    }
    if ( v.size() > 0 && i == v.size() )
        v[ v.size()-1 ] = i-1;
    

    The generated sequences are:

                                         0
                                         0  1
                                      0  1  2
                                      0  1  2  3
                                   0  2  1  3  4
                                   0  2  1  3  4  5
                                0  4  2  1  3  5  6
                                0  4  2  1  3  5  6  7
                             0  4  2  6  1  3  5  7  8
                             0  4  2  6  1  3  5  7  8  9
                          0  8  2  6  4  1  3  5  7  9 10
                          0  8  2  6  4  1  3  5  7  9 10 11
                       0  6  2 10  4  8  1  3  5  7  9 11 12
                       0  6  2 10  4  8  1  3  5  7  9 11 12 13
                    0 10  2  8  4 12  6  1  3  5  7  9 11 13 14
                    0 10  2  8  4 12  6  1  3  5  7  9 11 13 14 15
                 0  8  2 12  4 10  6 14  1  3  5  7  9 11 13 15 16
                 0  8  2 12  4 10  6 14  1  3  5  7  9 11 13 15 16 17
              0 16  2 10  4 14  6 12  8  1  3  5  7  9 11 13 15 17 18
              0 16  2 10  4 14  6 12  8  1  3  5  7  9 11 13 15 17 18 19
           0 10  2 18  4 12  6 16  8 14  1  3  5  7  9 11 13 15 17 19 20
           0 10  2 18  4 12  6 16  8 14  1  3  5  7  9 11 13 15 17 19 20 21
        0 16  2 12  4 20  6 14  8 18 10  1  3  5  7  9 11 13 15 17 19 21 22
        0 16  2 12  4 20  6 14  8 18 10  1  3  5  7  9 11 13 15 17 19 21 22 23
     0 12  2 18  4 14  6 22  8 16 10 20  1  3  5  7  9 11 13 15 17 19 21 23 24
    

    Pseudorandom element with random seed 0 is pivot

    The worst case sequences for center element and median-of-three look already pretty random, but in order to make Quicksort even more robust the pivot element can be chosen randomly. If the random sequence used is at least reproducible on every Quicksort run, then we can also construct a worst case sequence for that. We only have to adjust the iPivot = line in the first program, e.g. to:

    srand(0); // you shouldn't use 0 as a seed
    for ( auto i = 0u; i < v.size(); ++i )
    {
        auto const iPivot = i + rand() % ( v.size() - i );
        [...]
    

    The generated sequences are:

                                         0
                                         1  0
                                      1  0  2
                                      2  3  1  0
                                   1  4  2  0  3
                                   5  0  1  2  3  4
                                6  0  5  4  2  1  3
                                7  2  4  3  6  1  5  0
                             4  0  3  6  2  8  7  1  5
                             2  3  6  0  8  5  9  7  1  4
                          3  6  2  5  7  4  0  1  8 10  9
                          8 11  7  6 10  4  9  0  5  2  3  1
                       0 12  3 10  6  8 11  7  2  4  9  1  5
                       9  0  8 10 11  3 12  4  6  7  1  2  5 13
                    2  4 14  5  9  1 12  6 13  8  3  7 10  0 11
                    3 15  1 13  5  8  9  0 10  4  7  2  6 11 12 14
                11 16  8  9 10  4  6  1  3  7  0 12  5 14  2 15 13
                 6  0 15  7 11  4  5 14 13 17  9  2 10  3 12 16  1  8
              8 14  0 12 18 13  3  7  5 17  9  2  4 15 11 10 16  1  6
              3  6 16  0 11  4 15  9 13 19  7  2 10 17 12  5  1  8 18 14
           6  0 14  9 15  2  8  1 11  7  3 19 18 16 20 17 13 12 10  4  5
          14 16  7  9  8  1  3 21  5  4 12 17 10 19 18 15  6  0 11  2 13 20
        1  2 22 11 16  9 10 14 12  6 17  0  5 20  4 21 19  8  3  7 18 15 13
       22  1 15 18  8 19 13  0 14 23  9 12 10  5 11 21  6  4 17  2 16  7  3 20
     2 19 17  6 10 13 11  8  0 16 12 22  4 18 15 20  3 24 21  7  5 14  9  1 23
    

    So how to check whether those sequences are correct?

    • Measure time it took for the sequences. Plot time over the sequence length N. If the curve scales with O(N^2) instead of O(N log(N)), then these are indeed worst case sequences.
    • Adjust a correct Quicksort to give debug output about the subsequence lengths and/or the chosen pivot elements. One of the subsequences should always be of length 1 (or 2 for median-of-three). The chosen pivot elements printed should be increasing.

提交回复
热议问题