Combining NumPy arrays

北荒 2020-12-17 22:55

I have two 20x100x3 NumPy arrays which I want to combine into a 40x100x3 array, that is, just add more rows along the first axis. I am confused about which function I want: is it

5 Answers
  • 2020-12-17 22:57

    I tried a little benchmark between r_ and vstack and the result is very interesting:

    import numpy as np
    
    NCOLS = 10
    NROWS = 2
    NMATRICES = 10000
    
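    def mergeR(matrices):
        # Grow the result one matrix at a time with np.r_.
        result = np.zeros([0, NCOLS])
    
        for m in matrices:
            result = np.r_[result, m]
        return result
    
    def mergeVstack(matrices):
        # Stack all matrices in a single call.
        return np.vstack(matrices)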
    
    def main():
        matrices = tuple(np.random.random([NROWS, NCOLS]) for i in range(NMATRICES))
        mergeR(matrices)
        mergeVstack(matrices)
    
        return 0
    
    if __name__ == '__main__':
        main()
    

    Then I ran the profiler:

    python -m cProfile -s cumulative np_merge_benchmark.py
    

    and the results:

    ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    ...
         1    0.579    0.579    4.139    4.139 np_merge_benchmark.py:21(mergeR)
    ...
         1    0.000    0.000    0.054    0.054 np_merge_benchmark.py:27(mergeVstack)
    

    So a single vstack call is about 77x faster than growing the array with r_ in a loop (0.054 s versus 4.139 s cumulative time)!
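
The gap most likely comes from how the result is built rather than from r_ itself: each `np.r_[result, m]` inside the loop returns a new array containing everything accumulated so far, while `np.vstack(matrices)` builds the output in one call. A minimal sketch of that distinction, using a smaller matrix count than the benchmark; the helper names `grow_in_loop` and `stack_once` are made up for illustration:

```python
import numpy as np

NCOLS = 10
NROWS = 2
NMATRICES = 1000  # smaller than the benchmark, to keep the sketch quick


def grow_in_loop(matrices):
    """Append one matrix at a time; each step copies the rows gathered so far."""
    result = np.zeros([0, NCOLS])
    for m in matrices:
        result = np.r_[result, m]
    return result


def stack_once(matrices):
    """Build the output in a single call."""
    return np.vstack(matrices)


matrices = [np.random.random([NROWS, NCOLS]) for _ in range(NMATRICES)]
looped = grow_in_loop(matrices)
stacked = stack_once(matrices)

# Both approaches produce the same array; only the amount of copying differs.
print(looped.shape, stacked.shape)
print(np.array_equal(looped, stacked))
```

If the matrices arrive one at a time, collecting them in a Python list and stacking once at the end keeps the single-call behaviour.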

  • 2020-12-17 22:59

    Experimenting is one of the best ways to learn, but I would say you want np.vstack, although there are other ways of doing the same thing:

    import numpy as np
    
    a = np.ones((20,100,3))
    b = np.vstack((a,a))
    
    print(b.shape) # (40, 100, 3)
    

    or

    b = np.concatenate((a,a),axis=0)
    

    EDIT

    Just as a note, on my machine, for arrays of the size in the OP's question, I find that np.concatenate is about 2x faster than np.vstack:

    In [172]: a = np.random.normal(size=(20,100,3))
    
    In [173]: c = np.random.normal(size=(20,100,3))
    
    In [174]: %timeit b = np.concatenate((a,c),axis=0)
    100000 loops, best of 3: 13.3 us per loop
    
    In [175]: %timeit b = np.vstack((a,c))
    10000 loops, best of 3: 26.1 us per loop
    
  • 2020-12-17 23:11

    I believe it's vstack you want:

    import numpy
    
    # array_1 and array_2 are placeholder names for the two 20x100x3 arrays
    p = array_1
    q = array_2
    p = numpy.vstack([p, q])
    
  • 2020-12-17 23:19

    Might be worth mentioning that

        np.concatenate((a1, a2, ...), axis=0) 
    

    is the general form and vstack and hstack are specific cases. I find it easiest to just know which dimension I want to stack over and provide that as the argument to np.concatenate.
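
To make that concrete, here is a small check that, for arrays with the 3-D shape from the question, `np.vstack` matches `np.concatenate` with `axis=0` and `np.hstack` matches `axis=1`. The array contents are arbitrary, and the equivalence shown assumes inputs with at least two dimensions (for 1-D inputs the helpers behave differently, for example `vstack` first promotes them to rows).

```python
import numpy as np

a1 = np.zeros((20, 100, 3))
a2 = np.ones((20, 100, 3))

rows = np.concatenate((a1, a2), axis=0)   # join along the first axis
cols = np.concatenate((a1, a2), axis=1)   # join along the second axis
depth = np.concatenate((a1, a2), axis=2)  # join along the last axis

print(rows.shape)   # (40, 100, 3)
print(cols.shape)   # (20, 200, 3)
print(depth.shape)  # (20, 100, 6)

# vstack and hstack give the same results as the axis=0 and axis=1 calls here.
same_as_vstack = np.array_equal(rows, np.vstack((a1, a2)))
same_as_hstack = np.array_equal(cols, np.hstack((a1, a2)))
print(same_as_vstack, same_as_hstack)
```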

  • 2020-12-17 23:22

    By the way, there is also r_:

    >>> import numpy as np
    >>> a = np.random.rand(20,100,3)
    >>> b = np.random.rand(20,100,3)
    >>> a.shape
    (20, 100, 3)
    >>> b.shape
    (20, 100, 3)
    >>> np.r_[a,b].shape
    (40, 100, 3)
    >>> (np.r_[a,b] == np.vstack([a,b])).all()
    True
    