Question
Setup
I have the following two implementations of a matrix calculation:
1. The first implementation uses a matrix of shape `(n, m)`, and the calculation is repeated in a for-loop `repetition` times:
```python
import numpy as np
from numba import jit

@jit
def foo():
    for i in range(1, n):
        for j in range(1, m):
            _deleteA = (
                matrix[i, j] +
                # some constants added here
            )
            _deleteB = (
                matrix[i, j-1] +
                # some constants added here
            )
            matrix[i, j] = min(_deleteA, _deleteB)
    return matrix

repetition = 3
for x in range(repetition):
    foo()
```
2. The second implementation avoids the extra for-loop and instead folds `repetition = 3` into the matrix, which then has shape `(repetition, n, m)`:
```python
@jit
def foo():
    for i in range(1, n):
        for j in range(1, m):
            _deleteA = (
                matrix[:, i, j] +
                # some constants added here
            )
            _deleteB = (
                matrix[:, i, j-1] +
                # some constants added here
            )
            matrix[:, i, j] = np.amin(np.stack((_deleteA, _deleteB), axis=1), axis=1)
    return matrix
```
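As an aside on the inner reduction above: stacking the two slices and reducing along the new axis gives the same result as an element-wise `np.minimum`, which also avoids the temporary stacked array. A minimal sketch with made-up values:

```python
import numpy as np

a = np.array([3.0, 1.0, 2.0])
b = np.array([2.0, 4.0, 0.5])

# Reducing over the stacked pair...
stacked = np.amin(np.stack((a, b), axis=1), axis=1)
# ...is equivalent to a direct element-wise minimum:
direct = np.minimum(a, b)

print(np.array_equal(stacked, direct))  # True
```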
Questions
Regarding both implementations, I discovered two things about their performance with `%timeit` in IPython:
- The first implementation profits hugely from `@jit`, while the second does not at all (28 ms vs. 25 s in my test case). Can anybody imagine why `@jit` no longer works with a numpy array of shape `(repetition, n, m)`?
Edit
I moved the former second question to an extra post, since asking multiple questions is considered bad SO style. The question was:
- When neglecting `@jit`, the first implementation is still a lot faster (same test case: 17 s vs. 26 s). Why is numpy slower when working on three instead of two dimensions?
Answer 1:
I'm not sure what your setup is here, but I rewrote your example slightly:

```python
import numpy as np
from numba import jit

#@jit(nopython=True)
def foo(matrix):
    n, m = matrix.shape
    for i in range(1, n):
        for j in range(1, m):
            _deleteA = (
                matrix[i, j]  # +
                # some constants added here
            )
            _deleteB = (
                matrix[i, j-1]  # +
                # some constants added here
            )
            matrix[i, j] = min(_deleteA, _deleteB)
    return matrix

foo_jit = jit(nopython=True)(foo)
```
and then timings:

```python
m = np.random.normal(size=(100, 50))

%timeit foo(m)  # in a jupyter notebook
# 2.84 ms ± 54.2 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

%timeit foo_jit(m)  # in a jupyter notebook
# 3.18 µs ± 38.9 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
```
So here numba is a lot faster, as expected. One thing to consider is that global numpy arrays do not behave in numba as you might expect:
https://numba.pydata.org/numba-doc/dev/user/faq.html#numba-doesn-t-seem-to-care-when-i-modify-a-global-variable
It's usually better to pass the data in as an argument, as I did in the example.
Your issue in the second case is that numba does not support `np.amin` at this time. See:
https://numba.pydata.org/numba-doc/dev/reference/numpysupported.html
You can see this if you pass `nopython=True` to `jit`. So in current versions of numba (0.44 or earlier at the moment), it will fall back to object mode, which is often no faster than not using numba and is sometimes slower, since there is some call overhead.
Source: https://stackoverflow.com/questions/56992398/numba-on-nested-numpy-arrays