Comparison between timsort and quicksort

☆樱花仙子☆ 提交于 2019-11-28 17:43:55
Chang

TimSort is highly optimization mergesort, it is stable and faster than old mergesort.

when comparing with quicksort, it has two advantages:

  1. It is unbelievably fast for nearly sorted data sequence (including reverse sorted data);
  2. The worst case is still O(N*LOG(N)).

To be honest, I don't think #1 is a advantage, but it did impress me.

Here are QuickSort's advantages

  1. QuickSort is very very simple, even a highly tuned implementation, we can write down its pseduo codes within 20 lines;
  2. QuickSort is fastest in most cases;
  3. The memory consumption is LOG(N).

Currently, Java 7 SDK implements timsort and a new quicksort variant: i.e. Dual Pivot QuickSort.

If you need stable sort, try timsort, otherwise start with quicksort.

brc

More or less, it has to do with the fact that Timsort is a hybrid sorting algorithm. This means that while the two underlying sorts it uses (Mergesort and Insertion sort) are both worse than Quicksort for many kinds of data, Timsort only uses them when it is advantageous to do so.

On a slightly deeper level, as Patrick87 states, quicksort is a worst-case O(n2) algorithm. Choosing a good pivot isn't hard, but guaranteeing an O(n log n) quicksort comes at the cost of generally slower sorting on average.

For more detail on Timsort, see this answer, and the linked blog post. It basically assumes that most data is already partially sorted, and constructs "runs" of sorted data that allow for efficient merges using mergesort.

Generally speaking quicksort is best algorithm for primitive array. This is due to memory locality and cache.

JDK7 uses TimSort for Object array. Object array only holds object reference. The object itself is stored in Heap. To compare object, we need to read object from heap. This is like reading from one part of the heap for one object, then randomly reading object from another part of heap. There will be a lot of cache miss. I guess for this reason memory locality is not important any more. This is may be why JDK only uses TimSort for Object array instead if primitive array.

This is only my guess.

Here are benchmark numbers from my machine (i7-6700 CPU, 3.4GHz, Ubuntu 16.04, gcc 5.4.0, parameters: SIZE=100000 and RUNS=3):

$ ./demo 
Running tests
stdlib qsort time:                 12246.33 us per iteration
##quick sort time:                  5822.00 us per iteration
merge sort time:                    8244.33 us per iteration
...    
##tim sort time:                    7695.33 us per iteration
in-place merge sort time:           6788.00 us per iteration    
sqrt sort time:                     7289.33 us per iteration    
...
grail sort dyn buffer sort time:    7856.67 us per iteration

The benchmark comes from Swenson's sort project in which he as implemented several sorting algorithms in C. Presumably, his implementations are good enough to be representative, but I haven't investigated them.

So you really can't tell. Benchmark numbers only stay relevant for at most two years and then you have to repeat them. Possibly, timsort beat qsort waaay back in 2011 when the question was asked, but the times have changed. Or qsort was always the fastest, but timsort beat it on non-random data. Or Swenson's code isn't so good and a better programmer would turn the tide in timsort's favor. Or perhaps I suck and didn't use the right CFLAGS when compiling the code. Or... You get the point.

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!