How to speed up multiply and sum operations in numpy [duplicate]

Posted by 北城余情 on 2021-02-04 21:31:08

Question


I need to solve a Finite Element Method problem, which requires computing the following C from A and B for a large M (more than 1 million). For example,

import numpy as np
M = 4000000
A = np.random.rand(4, M, 3)
B = np.random.rand(M, 3)
C = (A * B).sum(axis=-1)  # needs to be optimized

Could anyone come up with code that is faster than (A * B).sum(axis=-1)? You can reshape or rearrange the axes of A, B, and C freely.
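To make the target operation explicit: the expression computes C[i, j] = sum over k of A[i, j, k] * B[j, k], with B broadcast across the leading axis of A. The following is a minimal sketch that checks this against a plain-loop reference on a small input. The helper name multiply_sum_reference and the small shapes are illustrative assumptions, not part of the original question.

```python
import numpy as np

def multiply_sum_reference(A, B):
    """Hypothetical loop-based reference for (A * B).sum(axis=-1)."""
    n, M, K = A.shape
    C = np.zeros((n, M))
    for i in range(n):
        for j in range(M):
            for k in range(K):
                # B has no leading axis, so the same B[j, k] is reused for every i
                C[i, j] += A[i, j, k] * B[j, k]
    return C

# Small shapes so the pure-Python loops finish instantly
rng = np.random.default_rng(0)
A_small = rng.random((4, 5, 3))
B_small = rng.random((5, 3))

C_small = (A_small * B_small).sum(axis=-1)
C_ref = multiply_sum_reference(A_small, B_small)
print(C_small.shape, np.allclose(C_small, C_ref))
```

Any faster variant should reproduce this result, up to floating-point rounding.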


Answer 1:


You can use np.einsum, which is more efficient in both run time and memory usage (about 4x faster in the timings below):

import numpy as np

M = 40000
A = np.random.rand(4, M, 3)
B = np.random.rand(M, 3)
out = (A * B).sum(axis=-1)  # baseline result, used for the check below

%timeit (A * B).sum(axis=-1)
# 5.23 ms ± 198 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

%timeit np.einsum('ijk,jk->ij', A, B)
# 1.31 ms ± 136 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

np.allclose(out, np.einsum('ijk,jk->ij', A, B))
# True
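The %timeit lines above only work inside IPython or Jupyter. As a rough sketch, the same comparison can be run as a plain script with the standard timeit module. The helper names baseline, with_einsum, and best_time are assumptions made for illustration, and the measured times will vary by machine and NumPy version.

```python
import timeit

import numpy as np

def baseline(A, B):
    # Builds a full (4, M, 3) temporary for A * B before summing the last axis
    return (A * B).sum(axis=-1)

def with_einsum(A, B):
    # i: leading axis of A, j: the M axis shared with B, k: the axis summed away
    return np.einsum('ijk,jk->ij', A, B)

def best_time(func, A, B, repeat=5, number=20):
    # Best average time per call, in seconds, over several repeats
    timings = timeit.repeat(lambda: func(A, B), repeat=repeat, number=number)
    return min(timings) / number

rng = np.random.default_rng(0)
M = 40000
A = rng.random((4, M, 3))
B = rng.random((M, 3))

t_base = best_time(baseline, A, B)
t_einsum = best_time(with_einsum, A, B)
print(f"baseline: {t_base * 1e3:.3f} ms per call")
print(f"einsum:   {t_einsum * 1e3:.3f} ms per call")
print("results match:", np.allclose(baseline(A, B), with_einsum(A, B)))
```

In the subscript string, k appears in both inputs but not in the output, so it is the axis that gets summed, while j appears in both inputs and the output, which lines up the M axis of A with that of B.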



Answer 2:


One general approach to speeding up NumPy multiplication is to call compiled code through ctypes. As far as I know, however, this is likely to give only a limited performance improvement, if any.




Answer 3:


You could use NumExpr like this for roughly a 3x speedup:

import numpy as np
import numexpr as ne

M=40000
A=np.random.rand(4, M, 3)
B=np.random.rand(M,3)

%timeit out = (A * B).sum(axis=-1)
# 2.12 ms ± 57.3 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

%timeit me = ne.evaluate('sum(A*B,2)')
# 662 µs ± 13.8 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)


out = (A * B).sum(axis=-1)
me = ne.evaluate('sum(A*B,2)')  # use the ne alias imported above
np.allclose(out, me)
# True


Source: https://stackoverflow.com/questions/63667395/how-to-speed-up-multiply-and-sum-operations-in-numpy
