Speeding up element-wise array multiplication in Python

小鲜肉 2020-12-08 11:32

I have been playing around with numba and numexpr trying to speed up a simple element-wise matrix multiplication. I have not been able to get better results, they both are b

5 answers
  •  旧巷少年郎
    2020-12-08 12:14

    What about using Fortran and ctypes?

    elementwise.F90:

    subroutine elementwise( a, b, c, M, N ) bind(c, name='elementwise')
      use iso_c_binding, only: c_float, c_int
      implicit none
    
      ! Dimensions are received by reference (no VALUE attribute)
      integer(c_int),intent(in) :: M, N
      real(c_float), intent(in) :: a(M, N), b(M, N)
      real(c_float), intent(out):: c(M, N)
    
      integer :: i,j
    
      ! Element-wise product c = a * b
      forall (i=1:M,j=1:N)
        c(i,j) = a(i,j) * b(i,j)
      end forall
    
    end subroutine elementwise
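
The subroutine declares its arrays as (M, N) in Fortran's column-major order, whereas a NumPy array created with default settings is row-major. For an element-wise product over contiguous buffers this mismatch should not matter, because each output element depends only on the inputs at the same memory offset. A small NumPy-only sketch of that argument (the sizes and the names `a_f`, `b_f`, `c_f` are illustration choices, not part of the original answer):

```python
import numpy as np

# Illustration sizes only; the original script uses M=10, N=5000000
M, N = 3, 4
rng = np.random.default_rng(0)
a = rng.random((M, N), dtype=np.float32)   # row-major (C order) by default
b = rng.random((M, N), dtype=np.float32)

# View the same flat buffers the way the Fortran routine would: column-major (M, N)
a_f = a.ravel().reshape((M, N), order='F')
b_f = b.ravel().reshape((M, N), order='F')
c_f = a_f * b_f

# Flattening in memory order gives the same sequence as the row-major product
same = np.array_equal(c_f.ravel(order='F'), (a * b).ravel())
print(same)
```

This only holds because the operation is element-wise and the arrays are contiguous; an operation that depends on row or column position would need the layouts to agree.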
    

    elementwise.py:

    from ctypes import CDLL, POINTER, c_int, c_float
    import numpy as np
    import time
    
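    # Load the shared library built from elementwise.F90
    fortran = CDLL('./elementwise.so')
    # Fortran receives every argument by reference, so M and N are pointers too
    fortran.elementwise.argtypes = [ POINTER(c_float), 
                                     POINTER(c_float), 
                                     POINTER(c_float),
                                     POINTER(c_int),
                                     POINTER(c_int) ]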
    
    # Setup    
    M=10
    N=5000000
    
    a = np.empty((M,N), dtype=c_float)
    b = np.empty((M,N), dtype=c_float)
    c = np.empty((M,N), dtype=c_float)
    
    a[:] = np.random.rand(M,N)
    b[:] = np.random.rand(M,N)
    
    
    # Fortran call (ctypes passes the c_int values by reference to match argtypes)
    start = time.time()
    fortran.elementwise( a.ctypes.data_as(POINTER(c_float)), 
                         b.ctypes.data_as(POINTER(c_float)), 
                         c.ctypes.data_as(POINTER(c_float)), 
                         c_int(M), c_int(N) )
    stop = time.time()
    print('Fortran took ', stop - start, 'seconds')
    
    # Numpy (note: this allocates a new result array inside the timed region)
    start = time.time()
    c = np.multiply(a,b)
    stop = time.time()
    print('Numpy took ', stop - start, 'seconds')
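
One caveat about the comparison: the NumPy line allocates a fresh result array while it is being timed, whereas the Fortran routine writes into a preallocated `c`. A sketch of the NumPy call that reuses the existing buffer through the ufunc's `out` argument, which may give a closer comparison (sizes are illustrative, and the timings reported in this answer were not re-measured with it):

```python
import numpy as np

M, N = 10, 1000   # much smaller than the original, for illustration only
a = np.random.rand(M, N).astype(np.float32)
b = np.random.rand(M, N).astype(np.float32)
c = np.empty((M, N), dtype=np.float32)

# Writes the product into the existing buffer instead of allocating a new one
result = np.multiply(a, b, out=c)

reused = result is c                  # the ufunc returns the array passed as out
matches = np.array_equal(c, a * b)    # same values as the allocating form
print(reused, matches)
```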
    

    I compiled the Fortran file using

    gfortran -O3 -funroll-loops -ffast-math -floop-strip-mine -shared -fPIC \
             -o elementwise.so elementwise.F90
    

    Across three runs, the Fortran version comes out roughly 7-15% faster than NumPy (about 10% on average):

     $ python elementwise.py 
    Fortran took  0.213667869568 seconds
    Numpy took  0.230120897293 seconds
     $ python elementwise.py 
    Fortran took  0.209784984589 seconds
    Numpy took  0.231616973877 seconds
     $ python elementwise.py 
    Fortran took  0.214708089828 seconds
    Numpy took  0.25369310379 seconds
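
Since single-shot timings like these vary noticeably from run to run, repeating the measurement and taking the best time is one way to get a steadier number. A minimal sketch using the standard `timeit` module; `best_time` is a hypothetical helper introduced here, and the array sizes are illustrative rather than those of the original benchmark:

```python
import timeit
import numpy as np

def best_time(func, repeat=5, number=3):
    """Return the best per-call time in seconds over several repeats."""
    times = timeit.repeat(func, repeat=repeat, number=number)
    return min(times) / number

M, N = 10, 10000   # illustrative sizes
a = np.random.rand(M, N).astype(np.float32)
b = np.random.rand(M, N).astype(np.float32)
c = np.empty_like(a)

# Time the NumPy product, reusing the output buffer on every call
t_numpy = best_time(lambda: np.multiply(a, b, out=c))
print('Numpy best of 5:', t_numpy, 'seconds')
```

The ctypes call to the Fortran routine could presumably be wrapped in a lambda and passed to the same helper, so that both sides are measured the same way.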
    
