What sort does Java Collections.sort(nodes) use?

余生长醉 提交于 2019-11-27 07:54:14

O(n log n) doesn't mean that the number of comparisons will be equal to or less than n log n, just that the time taken will scale proportionally to n log n. Try doing tests with 8 nodes, or 16 nodes, or 32 nodes, and checking out the timing.

You sorted four nodes, so you didn't get merge sort; sort switched to insertion sort.

In Java, the Arrays.sort() methods use merge sort or a tuned quicksort depending on the datatypes and for implementation efficiency switch to insertion sort when fewer than seven array elements are being sorted. (Wikipedia, emphasis added)

Arrays.sort is used indirectly by the Collections classes.

A recently accepted bug report indicates that the Sun implementation of Java will use Python's timsort in the future: http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6804124

(The timsort monograph, linked above, is well worth reading.)

An algorithm A(n) that processes an amount of data n is in O(f(n)), for some function f, if there exist two strictly positive constants C_inf and C_sup such that:

C_inf . f(n) < ExpectedValue(OperationCount(A(n))) < C_sup . f(n)

Two things to note:

  • The actual constants C could be anything, and do depend on the relative costs of operations (depending on the language, the VM, the architecture, or your actual definition of an operation). On some platforms, for instance, + and * have the same cost, on some other the later is an order of magnitude slower.

  • The quantity ascribed as "in O(f(n))" is an expected operation count, based on some probably arbitrary model of the data you are dealing with. For instance, if your data is almost completely sorted, a merge-sort algorithm is going to be mostly O(n), not O(n . Log(n)).

I've written some stuff you may be interested in about the Java sort algorithm and taken some performance measurements of Collections.sort(). The algorithm at present is a mergesort with an insertion sort once you get down to a certain size of sublists (N.B. this algorithm is very probably going to change in Java 7).

You should really take the Big O notation as an indication of how the algorithm will scale overall; for a particular sort, the precise time will deviate from the time predicted by this calculation (as you'll see on my graph, the two sort algorithms that are combined each have different performance characteristics, and so the overall time for a sort is a bit more complex).

That said, as a rough guide, for every time you double the number of elements, if you multiply the expected time by 2.2, you won't be far out. (It doesn't make much sense really to do this for very small lists of a few elements, though.)

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!