Question
I'm currently experimenting with numba, and especially vectorized functions, so I created a sum vectorized function (because it is easy to compare this to np.sum).
import numpy as np
import numba as nb

@nb.vectorize([nb.float64(nb.float64, nb.float64)])
def numba_sum(element1, element2):
    return element1 + element2

@nb.vectorize([nb.float64(nb.float64, nb.float64)], target='parallel')
def numba_sum_parallel(element1, element2):
    return element1 + element2

elements = 100000  # I vary this number in my tests
array = np.ones(elements)
np.testing.assert_almost_equal(numba_sum.reduce(array), np.sum(array))
np.testing.assert_almost_equal(numba_sum_parallel.reduce(array), np.sum(array))
Depending on the number of elements, the parallel code does not return the same number as the CPU-targeted code. I think that's because of the usual threading problems (but why? Is that a bug in Numba, or something that just happens with parallel execution?). The funny thing is that sometimes it works and sometimes it doesn't: sometimes it fails with elements=1000, sometimes it only starts failing at elements=100000.
For example:
AssertionError:
Arrays are not almost equal to 7 decimals
ACTUAL: 93238.0
DESIRED: 100000.0
and if I run it again:
AssertionError:
Arrays are not almost equal to 7 decimals
ACTUAL: 83883.0
DESIRED: 100000.0
My question is now: why would I ever want a parallel vectorized function? My understanding is that the purpose of a vectorized function is to provide the numpy-ufunc capabilities, but I tested reduce and accumulate, and they stop working at some (variable) number of elements. Who wants an unreliable function?
I'm using numba 0.23.1 and numpy 1.10.1 with Python 3.5.1.
Answer 1:
You ask: where would "parallel" vectorized functions make sense, given that they can lead to such problems?

Given that ufuncs produced by numba.vectorize(target='parallel') have defective reduce() methods, the question is what we can do with them that is useful.
In your case, the ufunc does addition. A useful application of this with target='parallel' is elementwise addition of two arrays:

numba_sum_parallel(array, array)

This is indeed faster than a single-core solution, and it seems not to be impacted by the bugs that cripple reduce() and friends.
Source: https://stackoverflow.com/questions/35459065/numbas-parallel-vectorized-functions