Process Array in parallel using GCD

Submitted by 痞子三分冷 on 2019-11-28 09:12:47

This is a slightly different take on the approach in @Eduardo's answer, using the Array type's withUnsafeMutableBufferPointer<R>(body: (inout UnsafeMutableBufferPointer<T>) -> R) -> R method. That method's documentation states:

Call body(p), where p is a pointer to the Array's mutable contiguous storage. If no such storage exists, it is first created.

Often, the optimizer can eliminate bounds- and uniqueness-checks within an array algorithm, but when that fails, invoking the same algorithm on body's argument lets you trade safety for speed.

That second paragraph seems to be exactly what's happening here, so using this method might be more "idiomatic" in Swift, whatever that means:

func calcSummary() {
    let group = dispatch_group_create()
    let queue = dispatch_get_global_queue(QOS_CLASS_USER_INITIATED, 0)

    self.summary.withUnsafeMutableBufferPointer {
        summaryMem -> Void in
        for i in 0 ..< 10 {
            dispatch_group_async(group, queue, {
                let base = i * 50000
                for x in base ..< base + 50000 {
                    summaryMem[i] += self.array[x]
                }
            })
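        }
        // The buffer pointer is only valid inside this closure,
        // so block until every task has finished before returning.
        dispatch_group_wait(group, DISPATCH_TIME_FOREVER)
    }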

    dispatch_group_notify(group, queue, {
        println(self.summary)
    })
}
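For readers on Swift 3 or later, where the C-style GCD functions became DispatchGroup and DispatchQueue, the same idea might look roughly like the sketch below. The free function chunkSums and its chunkCount parameter are hypothetical names chosen for illustration, and the sketch assumes array.count divides evenly by chunkCount (any leftover elements are ignored). It is written for the Swift 5 language mode; stricter concurrency checking may require wrapping the captured pointer.

```swift
import Dispatch

// Hypothetical helper: sums `chunkCount` equal slices of `array` in parallel.
func chunkSums(of array: [Int], chunkCount: Int) -> [Int] {
    var summary = [Int](repeating: 0, count: chunkCount)
    let chunkSize = array.count / chunkCount   // assumption: divides evenly
    let group = DispatchGroup()
    let queue = DispatchQueue.global(qos: .userInitiated)

    summary.withUnsafeMutableBufferPointer { summaryMem in
        // Copy the pointer: an escaping closure cannot capture an inout parameter.
        let mem = summaryMem
        for i in 0 ..< chunkCount {
            queue.async(group: group) {
                let base = i * chunkSize
                for x in base ..< base + chunkSize {
                    mem[i] += array[x]   // each task writes only to its own slot
                }
            }
        }
        // Wait here so the pointer is never used after the closure returns.
        group.wait()
    }
    return summary
}
```

Each task touches a different element of the buffer, so no locking is needed, and the group.wait() call inside the closure keeps the pointer valid for as long as the tasks use it.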

When you use the += operator, the LHS is an inout parameter -- I think you're getting race conditions when, as you mention in your update, Swift moves around the array for optimization. I was able to get it to work by summing the chunk in a local variable, then simply assigning to the right index in summary:

func calcSummary() {
    let group = dispatch_group_create()
    let queue = dispatch_get_global_queue(QOS_CLASS_USER_INITIATED, 0)

    for i in 0 ..< 10 {
        dispatch_group_async(group, queue, {
            let base = i * 50000
            var sum = 0
            for x in base ..< base + 50000 {
                sum += self.array[x]
            }
            self.summary[i] = sum
        })
    }

    dispatch_group_notify(group, queue, {
        println(self.summary)
    })
}
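A sketch of the same local-sum idea in Swift 3+ GCD syntax follows. The serial writer queue is an addition of this sketch, not something the answer above uses: it funnels every assignment through one queue so that no two tasks touch the array at the same time. SummaryBox, chunkSumsSerialized, and the queue label are hypothetical names, and the sketch assumes array.count divides evenly by chunkCount.

```swift
import Dispatch

// Hypothetical holder for the results; all access goes through the serial queue,
// which is why it is marked @unchecked Sendable.
final class SummaryBox: @unchecked Sendable {
    var values: [Int]
    init(count: Int) { values = [Int](repeating: 0, count: count) }
}

func chunkSumsSerialized(of array: [Int], chunkCount: Int) -> [Int] {
    let box = SummaryBox(count: chunkCount)
    let chunkSize = array.count / chunkCount   // assumption: divides evenly
    let group = DispatchGroup()
    let workers = DispatchQueue.global(qos: .userInitiated)
    let writer = DispatchQueue(label: "summary.writer")   // serial by default

    for i in 0 ..< chunkCount {
        workers.async(group: group) {
            let base = i * chunkSize
            // Sum the chunk locally; only the final store is serialized.
            let sum = array[base ..< base + chunkSize].reduce(0, +)
            writer.sync { box.values[i] = sum }
        }
    }
    group.wait()
    return writer.sync { box.values }
}
```

Because only one store per chunk goes through the serial queue, the cost of serializing is small compared with the summing work itself.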

I think Nate is right: there are race conditions with the summary variable. To fix it, I used summary's memory directly:

func calcSummary() {
    let group = dispatch_group_create()
    let queue = dispatch_get_global_queue(QOS_CLASS_USER_INITIATED, 0)

    let summaryMem = UnsafeMutableBufferPointer<Int>(start: &summary, count: 10)

    for i in 0 ..< 10 {
        dispatch_group_async(group, queue, {
            let base = i * 50000
            for x in base ..< base + 50000 {
                summaryMem[i] += self.array[x]
            }
        })
    }

    dispatch_group_notify(group, queue, {
        println(self.summary)
    })
}

This works (so far).

EDIT: Mike S makes a very good point in his comment below. I have also found this blog post, which sheds some light on the problem.

You can also use DispatchQueue.concurrentPerform(iterations: Int, execute work: (Int) -> Void), available since Swift 3.

It has a much simpler syntax:

DispatchQueue.concurrentPerform(iterations: iterations) { i in
    performOperation(i)
}

and it blocks until all iterations have finished before returning.
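To make the blocking behaviour concrete, here is a small sketch that records which iteration indices ran. The function runIterations is a hypothetical name, and the NSLock is only there so that the shared set is updated by one thread at a time.

```swift
import Foundation

// Hypothetical demo: collects the index of every iteration that ran.
func runIterations(_ iterations: Int) -> Set<Int> {
    var finished = Set<Int>()
    let lock = NSLock()

    DispatchQueue.concurrentPerform(iterations: iterations) { i in
        // Iterations may run on several threads, so guard the shared set.
        lock.lock()
        finished.insert(i)
        lock.unlock()
    }

    // concurrentPerform has returned, so every iteration has already completed.
    return finished
}
```

No DispatchGroup or completion handler is needed: by the time the call returns, the set already contains every index from 0 up to iterations - 1.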

Any solution that assigns the i'th element of the array concurrently risks a race condition (Swift's Array is not thread-safe). On the other hand, dispatching to the same queue (in this case main) before updating solves the problem but results in slower performance overall. The only reason I see for taking either of these two approaches is if the array (summary) cannot wait for all concurrent operations to finish.

Otherwise, perform the concurrent operations on a local copy and assign it to summary upon completion. No race condition, no performance hit:

Swift 4

func calcSummary(of array: [Int]) -> [Int] {
    var summary = [Int](repeating: 0, count: array.count)

    let iterations = 10 // number of parallel operations

    DispatchQueue.concurrentPerform(iterations: iterations) { index in
        let start = index * array.count / iterations
        let end = (index + 1) * array.count / iterations

        for i in start..<end {
            // Do stuff to get the i'th element
            summary[i] = Int.random(in: 0..<array.count)
        }
    }

    return summary
}
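One caveat with the version above: each summary[i] = ... inside the closure counts as a mutation of the whole array variable, so tools such as Thread Sanitizer can still report it as a race even though the indices never overlap. A possible variant, sketched below, combines concurrentPerform with the withUnsafeMutableBufferPointer method quoted earlier so that the writes go through raw storage instead. The name calcSummarySafely and the squaring step are placeholders for illustration only.

```swift
import Dispatch

// Hypothetical variant: writes through the array's contiguous storage.
func calcSummarySafely(of array: [Int], iterations: Int = 10) -> [Int] {
    var summary = [Int](repeating: 0, count: array.count)

    summary.withUnsafeMutableBufferPointer { buffer in
        let mem = buffer   // immutable copy of the pointer for the inner closure
        DispatchQueue.concurrentPerform(iterations: iterations) { index in
            let start = index * array.count / iterations
            let end = (index + 1) * array.count / iterations

            for i in start ..< end {
                // Placeholder for the real per-element work.
                mem[i] = array[i] * array[i]
            }
        }
        // concurrentPerform blocks, so the pointer never outlives this closure.
    }

    return summary
}
```

Because concurrentPerform returns only after every iteration has finished, the buffer pointer stays within the scope in which it is valid.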

I've answered a similar question here, about simply initializing an array after computing on another array.
