Question
I'm currently experimenting with Numba, and especially with vectorized functions, so I created a vectorized sum function (because it is easy to compare it to np.sum).
import numpy as np
import numba as nb

@nb.vectorize([nb.float64(nb.float64, nb.float64)])
def numba_sum(element1, element2):
    return element1 + element2

@nb.vectorize([nb.float64(nb.float64, nb.float64)], target='parallel')
def numba_sum_parallel(element1, element2):
    return element1 + element2

elements = 100000  # failures appear at varying sizes, see below
array = np.ones(elements)
np.testing.assert_almost_equal(numba_sum.reduce(array), np.sum(array))
np.testing.assert_almost_equal(numba_sum_parallel.reduce(array), np.sum(array))
Depending on the number of elements, the parallel code does not return the same number as the CPU-targeted code. I think that is related to the usual threading problems (but why? Is that a bug in Numba, or something that just happens with parallel execution?). Oddly, it sometimes works and sometimes does not: sometimes it fails with elements=1000, sometimes it only starts failing at elements=100000.
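As a sanity check of what reduce() should produce, the same reduction can be done with NumPy's own np.add ufunc, which is sequential and deterministic (a minimal sketch, no Numba required):

```python
import numpy as np

elements = 100000
array = np.ones(elements)

# np.add is the ufunc behind np.sum; its reduce() folds the array
# into a single accumulator, so the result is always exact here.
total = np.add.reduce(array)
assert total == 100000.0
assert total == np.sum(array)
```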
For example:
AssertionError:
Arrays are not almost equal to 7 decimals
ACTUAL: 93238.0
DESIRED: 100000.0
and if I run it again
AssertionError:
Arrays are not almost equal to 7 decimals
ACTUAL: 83883.0
DESIRED: 100000.0
My question is now: why would I ever want a parallel vectorized function? My understanding is that the purpose of a vectorized function is to provide the NumPy ufunc machinery, but I tested reduce and accumulate, and they stop working at some (variable) number of elements; who wants an unreliable function?
I'm using numba 0.23.1 and numpy 1.10.1 with python 3.5.1.
Answer 1:
You ask:
where would "parallel" vectorized functions make sense, given that they can lead to such problems
Given that ufuncs produced by numba.vectorize(target='parallel') have defective reduce() methods, the question is what can we do with them that is useful?
In your case, the ufunc does addition. A useful application of this with target='parallel' is elementwise addition of two arrays:
numba_sum(array, array)
This is indeed faster than a single-core solution, and seems not to be impacted by the bugs that cripple reduce() and friends.
Source: https://stackoverflow.com/questions/35459065/numbas-parallel-vectorized-functions