Why is numpy.array so slow?

南笙 2020-11-27 21:54

I am baffled by this

def main():
    for i in xrange(2560000):
        a = [0.0, 0.0, 0.0]

main()

$ time python test.py

real     0m0.793s
4 Answers
  •  情歌与酒
    2020-11-27 22:52

    Late answer, but could be important for other viewers.

    This problem has come up in the kwant project as well. Indeed, numpy is not optimized for small arrays, and quite frequently small arrays are exactly what you need.

    To address this, they created a substitute for small arrays that behaves like numpy arrays and co-exists with them (any operation not implemented in the new data type is handled by numpy).

    You should look into this project:
    https://pypi.python.org/pypi/tinyarray/1.0.5
    whose main purpose is to behave well for small arrays. Of course, some of the fancier things you can do with numpy are not supported by it. But numerics seem to be what you are after.

    I have made some small tests:

    python

    (I added the numpy import here as well, so that the module load time is the same in every run.)

    import numpy
    
    def main():
        for i in xrange(2560000):
            a = [0.0, 0.0, 0.0]
    
    main()
    

    numpy

    import numpy
    
    def main():
        for i in xrange(2560000):
            a = numpy.array([0.0, 0.0, 0.0])
    
    main()
    

    numpy_zero

    import numpy
    
    def main():
        for i in xrange(2560000):
            a = numpy.zeros((3,1))
    
    main()
    

    tiny (tinyarray.array)

    import numpy,tinyarray
    
    def main():
        for i in xrange(2560000):
            a = tinyarray.array([0.0, 0.0, 0.0])
    
    main()
    

    tiny_zero (tinyarray.zeros)

    import numpy,tinyarray
    
    def main():
        for i in xrange(2560000):
            a = tinyarray.zeros((3,1))
    
    main()
    

    I ran this:

    for f in python numpy numpy_zero tiny tiny_zero ; do
        echo $f
        for i in `seq 5` ; do
            time python ${f}_test.py
        done
    done
    

    And got:

    python
    python ${f}_test.py  0.31s user 0.02s system 99% cpu 0.339 total
    python ${f}_test.py  0.29s user 0.03s system 98% cpu 0.328 total
    python ${f}_test.py  0.33s user 0.01s system 98% cpu 0.345 total
    python ${f}_test.py  0.31s user 0.01s system 98% cpu 0.325 total
    python ${f}_test.py  0.32s user 0.00s system 98% cpu 0.326 total
    numpy
    python ${f}_test.py  2.79s user 0.01s system 99% cpu 2.812 total
    python ${f}_test.py  2.80s user 0.02s system 99% cpu 2.832 total
    python ${f}_test.py  3.01s user 0.02s system 99% cpu 3.033 total
    python ${f}_test.py  2.99s user 0.01s system 99% cpu 3.012 total
    python ${f}_test.py  3.20s user 0.01s system 99% cpu 3.221 total
    numpy_zero
    python ${f}_test.py  1.04s user 0.02s system 99% cpu 1.075 total
    python ${f}_test.py  1.08s user 0.02s system 99% cpu 1.106 total
    python ${f}_test.py  1.04s user 0.02s system 99% cpu 1.065 total
    python ${f}_test.py  1.03s user 0.02s system 99% cpu 1.059 total
    python ${f}_test.py  1.05s user 0.01s system 99% cpu 1.064 total
    tiny
    python ${f}_test.py  0.93s user 0.02s system 99% cpu 0.955 total
    python ${f}_test.py  0.98s user 0.01s system 99% cpu 0.993 total
    python ${f}_test.py  0.93s user 0.02s system 99% cpu 0.953 total
    python ${f}_test.py  0.92s user 0.02s system 99% cpu 0.944 total
    python ${f}_test.py  0.96s user 0.01s system 99% cpu 0.978 total
    tiny_zero
    python ${f}_test.py  0.71s user 0.03s system 99% cpu 0.739 total
    python ${f}_test.py  0.68s user 0.02s system 99% cpu 0.711 total
    python ${f}_test.py  0.70s user 0.01s system 99% cpu 0.721 total
    python ${f}_test.py  0.70s user 0.02s system 99% cpu 0.721 total
    python ${f}_test.py  0.67s user 0.01s system 99% cpu 0.687 total
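
To make the comparison easier to read, the five runs per script can be averaged. The sketch below only does arithmetic on the "total" times copied from the output above; the names `totals` and `means` are mine, and it assumes Python 3 for the f-strings.

```python
from statistics import mean

# Wall-clock "total" times in seconds, copied from the output above.
totals = {
    "python":     [0.339, 0.328, 0.345, 0.325, 0.326],
    "numpy":      [2.812, 2.832, 3.033, 3.012, 3.221],
    "numpy_zero": [1.075, 1.106, 1.065, 1.059, 1.064],
    "tiny":       [0.955, 0.993, 0.953, 0.944, 0.978],
    "tiny_zero":  [0.739, 0.711, 0.721, 0.721, 0.687],
}

# Average each script's five runs.
means = {name: mean(times) for name, times in totals.items()}

# Report each average relative to the plain-list script.
baseline = means["python"]
for name, avg in means.items():
    print(f"{name:<11} mean {avg:.3f} s  ({avg / baseline:.1f}x the plain list run)")
```

Keep in mind that every figure includes interpreter start-up and the numpy import, so the ratios understate the per-call differences.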
    

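    As already pointed out, these are not the best tests: each run also includes interpreter start-up and the numpy import. Still, they show that tinyarray is better suited to small arrays: tinyarray.array took about 0.95 s against roughly 3 s for numpy.array, and tinyarray.zeros about 0.7 s against about 1.06 s for numpy.zeros.
    The most common operations should also be faster with tinyarray, so the benefit may extend beyond array creation.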

    I have never tried it in a full-fledged project, but the kwant project uses it.
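
A somewhat fairer measurement would time only the construction itself, without interpreter start-up or import time. The following is a minimal sketch of that idea using the standard `timeit` module under Python 3; the helper names `build_candidates` and `benchmark` are hypothetical, and numpy and tinyarray are treated as optional so the script still runs if either is missing. Note that wrapping each constructor in a lambda adds one function call of overhead to every measurement.

```python
import timeit


def build_candidates():
    """Return {label: zero-argument callable}; numpy and tinyarray are optional."""
    candidates = {"list": lambda: [0.0, 0.0, 0.0]}
    try:
        import numpy
        candidates["numpy.array"] = lambda: numpy.array([0.0, 0.0, 0.0])
        candidates["numpy.zeros"] = lambda: numpy.zeros((3, 1))
    except ImportError:
        pass  # numpy not installed: skip those cases
    try:
        import tinyarray
        candidates["tinyarray.array"] = lambda: tinyarray.array([0.0, 0.0, 0.0])
        candidates["tinyarray.zeros"] = lambda: tinyarray.zeros((3, 1))
    except ImportError:
        pass  # tinyarray not installed: skip those cases
    return candidates


def benchmark(number=100_000, repeat=5):
    """Return {label: best observed seconds per call} for each candidate."""
    results = {}
    for label, func in build_candidates().items():
        # Take the minimum over several repeats to reduce noise from the system.
        best = min(timeit.repeat(func, number=number, repeat=repeat))
        results[label] = best / number
    return results


if __name__ == "__main__":
    for label, seconds in benchmark().items():
        print(f"{label:<16} {seconds * 1e9:8.0f} ns per call")
```

The absolute numbers will depend on the machine and library versions, so treat the output as a relative comparison only.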
