Why is numba faster than numpy here?


庸人自扰 2020-12-24 02:26

I can't figure out why numba is beating numpy here (by more than 3x). Did I make some fundamental error in how I am benchmarking? This seems like the perfect situation for numpy,

4 answers

难免孤独 (OP)

2020-12-24 02:49
I think this question highlights (somewhat) the limitations of calling out to precompiled functions from a higher-level language. Suppose in C++ you write something like:
```
for (int i = 0; i != N; ++i) a[i] = b[i] + c[i] + 2 * d[i];
```
The compiler sees all this at compile time, the whole expression. It can do a lot of really intelligent things here, including optimizing out temporaries (and loop unrolling).

In Python, however, consider what's happening: when you use numpy, each `+` uses operator overloading on the np array types (which are just thin wrappers around contiguous blocks of memory, i.e. arrays in the low-level sense) and calls out to a precompiled C function that does the addition super fast. But it does just one addition and spits out a temporary.
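
To make the temporaries concrete, here is a minimal sketch that spells out the separate calls NumPy makes for the same expression as the C++ loop. The array names follow that loop; the size `N`, the random data, and the names `t1`, `t2`, and `a_steps` are assumptions made for illustration, not taken from the original benchmark.

```python
import numpy as np

N = 100_000  # assumed size, for illustration only
rng = np.random.default_rng(0)
b, c, d = rng.random(N), rng.random(N), rng.random(N)

# The one-liner: each operator is a separate call into compiled code.
a = b + c + 2 * d

# Roughly the same work written out step by step.
t1 = np.add(b, c)          # temporary array holding b + c
t2 = np.multiply(2, d)     # temporary array holding 2 * d
a_steps = np.add(t1, t2)   # third pass over memory produces the result

print(np.array_equal(a, a_steps))
```

Each of the three calls walks over full-length arrays on its own, so the data is read and written several times instead of once.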

So while numpy is awesome, convenient, and pretty fast, it slows things down here: it looks as if it hands the hard work to a fast compiled language, but the compiler never gets to see the whole program; it is fed only isolated little bits. That is hugely detrimental to a compiler, especially modern compilers, which are very intelligent and can generate code that lets the CPU retire multiple instructions per cycle when the code is well written.

Numba, on the other hand, uses a JIT. At runtime it can figure out that the temporaries are not needed and optimize them away. Basically, Numba gets a chance to compile the program as a whole, whereas numpy can only call small atomic blocks that have themselves been precompiled.
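
As a rough sketch of what "compiled as a whole" can look like, the loop below mirrors the C++ one-liner and is handed to Numba's `njit` decorator. The function name `fused`, the array size, and the random data are hypothetical, since the original benchmark code is not shown. The `ImportError` fallback is only an assumption so that the sketch still runs (slowly, as plain Python) where Numba is not installed.

```python
import numpy as np

try:
    from numba import njit
except ImportError:
    # Assumption for this sketch: without Numba, run the plain Python function.
    def njit(func):
        return func

@njit
def fused(b, c, d):
    # Single pass over the data; no intermediate arrays are allocated.
    a = np.empty_like(b)
    for i in range(b.shape[0]):
        a[i] = b[i] + c[i] + 2 * d[i]
    return a

N = 100_000  # assumed size, for illustration only
rng = np.random.default_rng(0)
b, c, d = rng.random(N), rng.random(N), rng.random(N)

result = fused(b, c, d)
print(np.allclose(result, b + c + 2 * d))
```

When timing something like this, call `fused` once before measuring, because the first call includes the JIT compilation itself.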