Why is numba faster than numpy here?

庸人自扰 2020-12-24 02:26

I can't figure out why numba is beating numpy here (over 3x). Did I make some fundamental error in how I am benchmarking here? Seems like the perfect situation for numpy,

4 answers
  •  北海茫月
    2020-12-24 02:48

    Instead of cluttering the original question further, I'll add some more stuff here in response to Jeff, Jaime, Veedrac:

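    import numpy as np

    def proc_numpy2(x, y, z):
        # out= writes each result into an existing array, so x, y and z are overwritten
        np.subtract(np.multiply(x, 2), np.multiply(y, 55), out=x)
        np.add(x, np.multiply(y, 2), out=y)
        np.add(x, np.add(y, 99), out=z)
        np.multiply(z, np.subtract(z, .88), out=z)
        return z

    def proc_numpy3(x, y, z):
        # augmented assignment modifies the caller's x and y in place
        x *= 2
        x -= y * 55
        y *= 2
        y += x
        z = x + y  # rebinds z to a new array; the caller's z is not modified
        z += 99
        z *= (z - .88)
        return z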
    

    My machine seems to be running a tad faster today than yesterday, so here are the new timings in comparison to proc_numpy (proc_numba is timing the same as before):

    In [611]: %timeit proc_numpy(x,y,z)
    10000 loops, best of 3: 103 µs per loop
    
    In [612]: %timeit proc_numpy2(x,y,z)
    10000 loops, best of 3: 92.5 µs per loop
    
    In [613]: %timeit proc_numpy3(x,y,z)
    10000 loops, best of 3: 85.1 µs per loop
    

    Note that while writing proc_numpy2/3 I started seeing side effects, because both functions modify their arguments in place, so I made copies of x, y, z and passed the copies instead of reusing x, y, z. Also, the different functions sometimes differed slightly in precision, so some of them didn't pass the equality tests, but if you diff them, they are really close. I assume that is due to creating (or not creating) temp variables. E.g.:

    In [458]: (res_numpy2 - res_numba)[:12]
    Out[458]: 
    array([ -7.27595761e-12,   0.00000000e+00,   0.00000000e+00,
             0.00000000e+00,   0.00000000e+00,   0.00000000e+00,
             0.00000000e+00,   0.00000000e+00,   0.00000000e+00,
             0.00000000e+00,  -7.27595761e-12,   0.00000000e+00])
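
    The side effects and the small precision differences described above can be reproduced with a sketch like the one below. It is only an illustration under assumptions: proc_numpy_ref is a hypothetical out-of-place reference written from the arithmetic that proc_numpy2 encodes (it is not necessarily the original proc_numpy), and the array size, scale and seed are arbitrary. It compares results with np.allclose instead of exact equality.

```python
import numpy as np

def proc_numpy_ref(x, y, z):
    # Hypothetical out-of-place reference: same arithmetic as proc_numpy2,
    # but every step allocates a new array, so the inputs are left untouched.
    x = x * 2 - y * 55
    y = x + y * 2
    z = x + y + 99
    z = z * (z - .88)
    return z

def proc_numpy2(x, y, z):
    # In-place version from the answer: out= overwrites x, y and z.
    np.subtract(np.multiply(x, 2), np.multiply(y, 55), out=x)
    np.add(x, np.multiply(y, 2), out=y)
    np.add(x, np.add(y, 99), out=z)
    np.multiply(z, np.subtract(z, .88), out=z)
    return z

rng = np.random.default_rng(0)      # arbitrary seed (assumption)
x = rng.random(10_000) * 100        # arbitrary size and scale (assumption)
y = rng.random(10_000) * 100
z = np.zeros_like(x)

x_before = x.copy()
res_ref = proc_numpy_ref(x, y, z)
inputs_untouched = np.array_equal(x, x_before)   # reference leaves x alone

# Pass copies so the in-place version cannot overwrite the originals.
res_inplace = proc_numpy2(x.copy(), y.copy(), z.copy())
originals_safe = np.array_equal(x, x_before)

# Without fresh copies per call, the arrays handed in are overwritten.
x_mut, y_mut, z_mut = x.copy(), y.copy(), z.copy()
proc_numpy2(x_mut, y_mut, z_mut)
x_was_mutated = not np.array_equal(x_mut, x_before)

# Compare with a tolerance rather than exact equality.
close = np.allclose(res_ref, res_inplace)
max_abs_diff = np.max(np.abs(res_ref - res_inplace))
print(inputs_untouched, originals_safe, x_was_mutated, close, max_abs_diff)
```

    The two versions group the additions differently ((x + y) + 99 versus x + (y + 99)), which is one plausible source of last-bit differences, so a tolerance-based check is the safer equality test here.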
    

    Also, it's pretty minor (about 10 µs), but using float literals (55. instead of 55) also saves a little time for numpy; it doesn't help numba.
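
    One way to check the float-literal effect on your own machine is a small timeit comparison like the sketch below. The array size and repeat counts are arbitrary assumptions, and whether any gap shows up at all is likely to depend on the NumPy version and hardware, so no expected timings are given.

```python
import timeit
import numpy as np

y = np.random.default_rng(0).random(10_000)   # float64 array, arbitrary size

int_result = y * 55      # integer literal
float_result = y * 55.   # float literal

# Both forms give the same float64 values; only the timing may differ.
same_values = np.array_equal(int_result, float_result)
same_dtype = int_result.dtype == float_result.dtype == np.float64

calls = 2000
t_int = min(timeit.repeat(lambda: y * 55, number=calls, repeat=5))
t_float = min(timeit.repeat(lambda: y * 55., number=calls, repeat=5))
print(f"int literal:   {t_int / calls * 1e6:.2f} us per call")
print(f"float literal: {t_float / calls * 1e6:.2f} us per call")
```

    Taking the minimum over several repeats mirrors the "best of 3" figure that %timeit reports above.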
