Which general purpose sorting algorithm does Swift use? It does not perform well on sorted data

孤街醉人 提交于 2020-01-12 07:01:13

问题


I have been picking and probing at Swift standard libraries sort() function for its Array type. To my surprise I have noticed it performs poorly on already-sorted data.

Sorting an array of Int which is shuffled seems to be 5x faster than sorting that very same array when it is already sorted. Sorting an array of shuffled objects is about 4x faster than sorting the very same one already in sorted order (sorting object array vs Int array use different algorithms I am sure so I sorted both to eliminate bias).

These are the results:

Shuffled Int array sort time: 1.3961209654808
Shuffled ColorObject array sort time: 3.14633798599243
NOnshuffled Int array sort time: 7.34714204072952
NOnshuffled ColorObject array sort time: 10.9310839772224

For reference below is my code:

class ElapsedTimer {

    let startTime: CFAbsoluteTime
    var endTime: CFAbsoluteTime?

    init() {
        startTime = CFAbsoluteTimeGetCurrent()
    }

    func stop() -> CFAbsoluteTime {
        endTime = CFAbsoluteTimeGetCurrent()
        return duration!
    }

    var duration: CFAbsoluteTime? {
        if let endTime = endTime {
            return endTime - startTime
        } else {
            return nil
        }
    }
}

public class CountedColor {
    public private(set) var count: Int
    public private(set) var color: UIColor

    public init(color: UIColor, colorCount: Int) {
        self.count = colorCount
        self.color = color
    }
}

var distributedIntArray = [Int]()

for value in 1..<1000000 {
    distributedIntArray.append(value)
}

var distributedCountedColorArray = distributedIntArray.map{ CountedColor(color: UIColor.white, colorCount: $0) }

distributedCountedColorArray.shuffle()
distributedIntArray.shuffle()

var timer = ElapsedTimer()
distributedIntArray.sort()
print("Shuffled Int array sort time: \(timer.stop())")
timer = ElapsedTimer()

distributedCountedColorArray.sort{ return $0.count < $1.count }
print("Shuffled Color array sort time: \(timer.stop())")
timer = ElapsedTimer()

distributedIntArray.sort()
print("NOnshuffled Int array sort time: \(timer.stop())")
timer = ElapsedTimer()

distributedCountedColorArray.sort{ return $0.count < $1.count }
print("Non shuffled Color array sort time: \(timer.stop())")

My array shuffle() method was pulled from this post. My ElapsedTimer simply wraps and uses CACurrentMediaTime() functions.

My question is why am I seeing this behavior? Especially when I am sorting the object array which should surely be using a general purpose sort. What kind of general purpose sorting algorithm is swift using? It surely can’t be one where the worst case and average case are the same like mergeSort.


回答1:


Swift uses Introsort. Looking at the source code we see that the chosen pivot is the first element. The wikipedia page on Introsort says:

(...), one of the critical operations is choosing the pivot: the element around which the list is partitioned. The simplest pivot selection algorithm is to take the first or the last element of the list as the pivot, causing poor behavior for the case of sorted or nearly sorted input.

Thus it is entirely predictable, given the implementation choice, that Swift's sorting performance is worst for sorted inputs.

I have built a complete benchmark for people who want to easily reproduce the OP's claims : https://github.com/lemire/Code-used-on-Daniel-Lemire-s-blog/tree/master/extra/swift/sort

For reference, the GNU ISO C++ standard library uses a median-of-3 pivot (as per the stl_algo.h header).




回答2:


This is a duplicate of this question it seems: Swift sorting algorithm implementation

Furthermore the reason why it performs so much better when shuffled is probably just because when its shuffled its performance is not hitting the upper bound of NlogN. Sorting the sorted array probably gets closer to that upper limit so its still all in all the same sort. But I dont know this is just theory



来源:https://stackoverflow.com/questions/41031106/which-general-purpose-sorting-algorithm-does-swift-use-it-does-not-perform-well

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!