Fastest way to grow a numpy numeric array

后悔当初 2020-11-27 12:01

Requirements:

  • I need to grow an array arbitrarily large from data.
  • I can guess the size (roughly 100-200) with no guarantees that the array will fit
5 answers
  •  被撕碎了的回忆
    2020-11-27 12:44

    Using the class declarations from Owen's post, here are revised timings that also account for the cost of finalize.

    In short, I find class C to provide an implementation that is over 60x faster than the method in the original post. (apologies for the wall of text)

    The file I used:

    #!/usr/bin/python
    import cProfile
    import numpy as np
    
    # ... class declarations here ...
    
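    def test_class(f):
        # Build one instance, then exercise both code paths.
        x = f()
        for i in range(100000):  # 100,000 single-element updates
            x.update([i])
        for i in range(1000):  # 1,000 finalize calls
            x.finalize()
    
    # Profile each class (A, B, C) in turn.
    for x in 'ABC':
        cProfile.run('test_class(%s)' % x)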
    

    Now, the resulting timings:

    A:

         903005 function calls in 16.049 seconds
    
    Ordered by: standard name
    
    ncalls  tottime  percall  cumtime  percall filename:lineno(function)
         1    0.000    0.000   16.049   16.049 <string>:1(<module>)
    100000    0.139    0.000    1.888    0.000 fromnumeric.py:1043(ravel)
      1000    0.001    0.000    0.003    0.000 fromnumeric.py:107(reshape)
    100000    0.322    0.000   14.424    0.000 function_base.py:3466(append)
    100000    0.102    0.000    1.623    0.000 numeric.py:216(asarray)
    100000    0.121    0.000    0.298    0.000 numeric.py:286(asanyarray)
      1000    0.002    0.000    0.004    0.000 test.py:12(finalize)
         1    0.146    0.146   16.049   16.049 test.py:50(test_class)
         1    0.000    0.000    0.000    0.000 test.py:6(__init__)
    100000    1.475    0.000   15.899    0.000 test.py:9(update)
         1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
    100000    0.126    0.000    0.126    0.000 {method 'ravel' of 'numpy.ndarray' objects}
      1000    0.002    0.000    0.002    0.000 {method 'reshape' of 'numpy.ndarray' objects}
    200001    1.698    0.000    1.698    0.000 {numpy.core.multiarray.array}
    100000   11.915    0.000   11.915    0.000 {numpy.core.multiarray.concatenate}
    

    B:

         208004 function calls in 16.885 seconds
    
    Ordered by: standard name
    
    ncalls  tottime  percall  cumtime  percall filename:lineno(function)
         1    0.001    0.001   16.885   16.885 <string>:1(<module>)
      1000    0.025    0.000   16.508    0.017 fromnumeric.py:107(reshape)
      1000    0.013    0.000   16.483    0.016 fromnumeric.py:32(_wrapit)
      1000    0.007    0.000   16.445    0.016 numeric.py:216(asarray)
         1    0.000    0.000    0.000    0.000 test.py:16(__init__)
    100000    0.068    0.000    0.080    0.000 test.py:19(update)
      1000    0.012    0.000   16.520    0.017 test.py:23(finalize)
         1    0.284    0.284   16.883   16.883 test.py:50(test_class)
      1000    0.005    0.000    0.005    0.000 {getattr}
      1000    0.001    0.000    0.001    0.000 {len}
    100000    0.012    0.000    0.012    0.000 {method 'append' of 'list' objects}
         1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
      1000    0.020    0.000    0.020    0.000 {method 'reshape' of 'numpy.ndarray' objects}
      1000   16.438    0.016   16.438    0.016 {numpy.core.multiarray.array}
    

    C:

         204010 function calls in 0.244 seconds
    
    Ordered by: standard name
    
    ncalls  tottime  percall  cumtime  percall filename:lineno(function)
         1    0.000    0.000    0.244    0.244 <string>:1(<module>)
      1000    0.001    0.000    0.003    0.000 fromnumeric.py:107(reshape)
         1    0.000    0.000    0.000    0.000 test.py:27(__init__)
    100000    0.082    0.000    0.170    0.000 test.py:32(update)
    100000    0.087    0.000    0.088    0.000 test.py:36(add)
      1000    0.002    0.000    0.005    0.000 test.py:46(finalize)
         1    0.068    0.068    0.243    0.243 test.py:50(test_class)
      1000    0.000    0.000    0.000    0.000 {len}
         1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
      1000    0.002    0.000    0.002    0.000 {method 'reshape' of 'numpy.ndarray' objects}
         6    0.001    0.000    0.001    0.000 {numpy.core.multiarray.zeros}
    

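    Class A is dominated by the cost of its updates: the concatenate calls behind numpy's append account for 11.9 of its 16.0 seconds. Class B is dominated by the cost of finalize: converting the list to an array accounts for 16.4 of its 16.9 seconds. Class C stays fast under both.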
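
To illustrate why a preallocated buffer holds up under both many updates and many finalize calls, here is a minimal sketch of that pattern. It is not Owen's code, which is elided above: the class name `GrowableArray`, the starting capacity of 100, and the growth factor of 4 are assumptions chosen for illustration. The six `zeros` calls in the profile for C would be consistent with this kind of geometric growth, but the actual parameters are not shown in this post. Unlike the profiled classes, this `finalize` skips the reshape and simply returns a view.

```python
import numpy as np


class GrowableArray:
    """Hypothetical 1-D buffer that grows geometrically (illustrative only)."""

    def __init__(self, capacity=100, growth=4):
        # capacity and growth are assumed values, not taken from the post.
        self.data = np.zeros(capacity)
        self.capacity = capacity
        self.growth = growth
        self.size = 0

    def add(self, x):
        # Reallocate only when the buffer is full, so copies are rare.
        if self.size == self.capacity:
            self.capacity *= self.growth
            new_data = np.zeros(self.capacity)
            new_data[:self.size] = self.data
            self.data = new_data
        self.data[self.size] = x
        self.size += 1

    def update(self, row):
        for x in row:
            self.add(x)

    def finalize(self):
        # Slicing returns a view of the filled part; no data is copied.
        return self.data[:self.size]


buf = GrowableArray()
for i in range(1000):
    buf.update([i])
result = buf.finalize()
```

Each update writes into existing storage and each finalize is a cheap slice, so neither operation scales with the number of elements already stored, apart from the occasional reallocation.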
