Why does Java 6 Arrays#sort(Object[]) change from mergesort to insertionsort for small arrays?

慢半拍i 2020-12-15 04:42

Java 6's mergesort implementation in Arrays.java uses an insertion sort if the array length is less than some threshold. This value is hard-coded to 7. As th

3 Answers
  • 2020-12-15 05:21

    Yes, this is intentional. Although the asymptotic cost of mergesort, O(n log n), is lower than that of quadratic sorts such as insertion sort, the operations it performs are more complex and therefore slower on small inputs.

    Consider sorting an array of length 8. Merge sort makes ~14 recursive calls to itself in addition to 7 merge operations. Each recursive call contributes some non-trivial overhead to the run-time. Each merge operation involves a loop where index variables must be initialized, incremented, and compared, temporary arrays must be copied, etc. All in all, you can expect well over 300 "simple" operations.

    On the other hand, insertion sort is inherently simple and uses about 8^2 = 64 operations, which is far fewer.
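The call and merge counts quoted above for a length-8 array can be checked with a small counter. This is a minimal sketch that assumes a plain top-down merge sort that recurses all the way down to single elements; the class and method names are hypothetical, and it only walks the recursion without sorting any data.

```java
final class MergeSortCallCount {
    private static int recursiveCalls;
    private static int merges;

    // Mirrors the recursion of a plain top-down merge sort over a range
    // of the given length, counting calls and merges instead of sorting.
    private static void visit(int length) {
        if (length < 2) {
            return; // a single element is already sorted
        }
        int left = length / 2;
        recursiveCalls += 2;   // one call per half
        visit(left);
        visit(length - left);
        merges++;              // the two sorted halves are merged once
    }

    // Returns {recursive calls, merge operations} for an array of the given length.
    static int[] count(int length) {
        recursiveCalls = 0;
        merges = 0;
        visit(length);
        return new int[] {recursiveCalls, merges};
    }

    public static void main(String[] args) {
        int[] result = count(8);
        System.out.println("recursive calls: " + result[0] + ", merges: " + result[1]);
    }
}
```

For length 8 this gives 14 recursive calls (2 + 4 + 8) and 7 merges (1 + 2 + 4), matching the figures in the answer.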

    Think about it this way. When you sort a list of 10 numbers by hand, do you use merge sort? No, because your brain is much better at doing simple things like insertion sort. However, if I gave you a year to sort a list of 100,000 numbers, you might be more inclined to merge sort it.

    As for the magic number 7, it was derived empirically as the best-performing cutoff.
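To make the cutoff idea concrete, here is a minimal sketch of a merge sort that hands small ranges to insertion sort. It is not the JDK source: the class name, the `int[]` element type, and the merge strategy are assumptions chosen for brevity, and only the threshold value 7 is taken from the question.

```java
import java.util.Arrays;

final class HybridMergeSort {
    // Ranges shorter than this are insertion-sorted; 7 is the value quoted in the question.
    static final int INSERTION_SORT_THRESHOLD = 7;

    static void sort(int[] a) {
        sort(a, new int[a.length], 0, a.length);
    }

    // Sorts the half-open range a[low, high), using tmp as scratch space.
    private static void sort(int[] a, int[] tmp, int low, int high) {
        if (high - low < INSERTION_SORT_THRESHOLD) {
            insertionSort(a, low, high);
            return;
        }
        int mid = (low + high) >>> 1;
        sort(a, tmp, low, mid);
        sort(a, tmp, mid, high);
        merge(a, tmp, low, mid, high);
    }

    private static void insertionSort(int[] a, int low, int high) {
        for (int i = low + 1; i < high; i++) {
            int key = a[i];
            int j = i - 1;
            // Shift larger elements one slot to the right, then drop the key in.
            while (j >= low && a[j] > key) {
                a[j + 1] = a[j];
                j--;
            }
            a[j + 1] = key;
        }
    }

    // Merges the sorted ranges a[low, mid) and a[mid, high).
    private static void merge(int[] a, int[] tmp, int low, int mid, int high) {
        System.arraycopy(a, low, tmp, low, high - low);
        int i = low;
        int j = mid;
        for (int k = low; k < high; k++) {
            if (j >= high || (i < mid && tmp[i] <= tmp[j])) {
                a[k] = tmp[i++]; // take from the left run (ties go left, keeping the merge stable)
            } else {
                a[k] = tmp[j++]; // take from the right run
            }
        }
    }

    public static void main(String[] args) {
        int[] data = {9, 3, 7, 1, 8, 2, 6, 5, 4, 0, 11, 10};
        sort(data);
        System.out.println(Arrays.toString(data));
    }
}
```

Changing `INSERTION_SORT_THRESHOLD` and timing the result on representative inputs is how such a cutoff would typically be tuned.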

    EDIT: In a standard insertion sort of 8 elements, the worst-case scenario leads to 8*7/2 = 28 comparisons. In a canonical merge sort, you have at most ~24 comparisons (8*log2(8)). Adding in the overhead from the method calls and the complexity of the operations, insertion sort should be faster. Additionally, if you look at the average case, insertion sort makes fewer comparisons than its worst case.
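Comparison counts like these can be measured directly by instrumenting both sorts. The following is a rough sketch, assuming a swap-based insertion sort and a plain top-down merge sort with no cutoff; the class and helper names are hypothetical.

```java
final class ComparisonCount {
    private static int comparisons;

    // Every element comparison in both sorts goes through this helper.
    private static boolean greater(int x, int y) {
        comparisons++;
        return x > y;
    }

    static int insertionSortComparisons(int[] input) {
        int[] a = input.clone();
        comparisons = 0;
        for (int i = 1; i < a.length; i++) {
            // Swap the new element leftwards until it is no longer out of order.
            for (int j = i; j > 0 && greater(a[j - 1], a[j]); j--) {
                int t = a[j];
                a[j] = a[j - 1];
                a[j - 1] = t;
            }
        }
        return comparisons;
    }

    static int mergeSortComparisons(int[] input) {
        int[] a = input.clone();
        comparisons = 0;
        mergeSort(a, new int[a.length], 0, a.length);
        return comparisons;
    }

    // Plain top-down merge sort on a[low, high), recursing down to single elements.
    private static void mergeSort(int[] a, int[] tmp, int low, int high) {
        if (high - low < 2) {
            return;
        }
        int mid = (low + high) >>> 1;
        mergeSort(a, tmp, low, mid);
        mergeSort(a, tmp, mid, high);
        System.arraycopy(a, low, tmp, low, high - low);
        int i = low;
        int j = mid;
        for (int k = low; k < high; k++) {
            if (i >= mid) {
                a[k] = tmp[j++];            // left run exhausted
            } else if (j >= high) {
                a[k] = tmp[i++];            // right run exhausted
            } else if (greater(tmp[i], tmp[j])) {
                a[k] = tmp[j++];
            } else {
                a[k] = tmp[i++];
            }
        }
    }

    public static void main(String[] args) {
        int[] reversed = {8, 7, 6, 5, 4, 3, 2, 1};
        System.out.println("insertion sort: " + insertionSortComparisons(reversed));
        System.out.println("merge sort:     " + mergeSortComparisons(reversed));
    }
}
```

On the reversed array in `main`, insertion sort hits its worst case of 8*7/2 = 28 comparisons. Comparison counts alone leave out the call, copy, and index-bookkeeping overhead, which is exactly what tips the balance toward insertion sort on short arrays.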

  • 2020-12-15 05:32

    My understanding is that this is an empirically derived value: below it, the time required for an insertion sort is actually lower, despite a (possibly) higher number of comparisons. This is because near the end of a mergesort the data is likely to be almost sorted, which makes insertion sort perform well.
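Whether the small subarrays really are almost sorted is this answer's assumption. The sketch below only illustrates why insertion sort benefits when they are: the number of element shifts it performs equals the number of out-of-order pairs in the input. The class and method names are hypothetical.

```java
final class InsertionSortShifts {
    // Insertion-sorts a copy of the input and returns how many element shifts were needed.
    static int countShifts(int[] input) {
        int[] a = input.clone();
        int shifts = 0;
        for (int i = 1; i < a.length; i++) {
            int key = a[i];
            int j = i - 1;
            while (j >= 0 && a[j] > key) {
                a[j + 1] = a[j]; // move the larger element one slot right
                j--;
                shifts++;
            }
            a[j + 1] = key;
        }
        return shifts;
    }

    public static void main(String[] args) {
        System.out.println(countShifts(new int[] {1, 2, 3, 4, 5, 6, 7, 8})); // already sorted
        System.out.println(countShifts(new int[] {1, 2, 4, 3, 5, 6, 8, 7})); // two adjacent pairs swapped
        System.out.println(countShifts(new int[] {8, 7, 6, 5, 4, 3, 2, 1})); // fully reversed
    }
}
```

The three inputs need 0, 2, and 28 shifts respectively, so the closer the data is to sorted, the closer insertion sort gets to a single linear pass.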

  • 2020-12-15 05:39

    Insertion sort needs n(n-1)/2 comparisons in the worst case, and merge sort needs roughly n*log2(n).

    Given this:

    1. For an array of length 5 => insertion sort = 10 and merge sort = 11.609
    2. For an array of length 6 => insertion sort = 15 and merge sort = 15.509
    3. For an array of length 7 => insertion sort = 21 and merge sort = 19.651
    4. For an array of length 8 => insertion sort = 28 and merge sort = 24

    From the data above, it is clear that up to length 6 insertion sort needs fewer comparisons, and from length 7 onward merge sort is more efficient.

    That explains why 7 is used.
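The figures in this answer can be reproduced with a few lines of Java. This is a small sketch using the two formulas as stated; the class and method names are hypothetical, and the formulas count comparisons only, not the other costs discussed in the first answer.

```java
import java.util.Locale;

final class ComparisonEstimates {
    // Worst-case comparison count of insertion sort: n(n-1)/2.
    static int insertionWorstCase(int n) {
        return n * (n - 1) / 2;
    }

    // Rough comparison estimate for merge sort: n * log2(n).
    static double mergeEstimate(int n) {
        return n * (Math.log(n) / Math.log(2));
    }

    public static void main(String[] args) {
        for (int n = 5; n <= 8; n++) {
            System.out.println(String.format(Locale.ROOT,
                    "n=%d insertion=%d merge=%.3f",
                    n, insertionWorstCase(n), mergeEstimate(n)));
        }
    }
}
```

The crossover between the two formulas falls between n = 6 and n = 7, which is the observation the answer relies on.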
