Access multiple elements of an array

心在旅途 2020-12-06 15:08

Is there a way to get array elements in one operation for known rows and columns of those elements? In each row I would like to access elements from col_start to col_end (ea

3 Answers
  • 2020-12-06 15:28
    import numpy as np
    A = np.arange(40).reshape(4,10)*.1
    startend = [[2,5],[3,6],[4,7],[5,8]]
    index_list = [np.arange(v[0],v[1]) + i*A.shape[1] 
                     for i,v in enumerate(startend)]
    # [array([2, 3, 4]), array([13, 14, 15]), array([24, 25, 26]), array([35, 36, 37])]
    A.flat[index_list]
    

    producing

    array([[ 0.2,  0.3,  0.4],
           [ 1.3,  1.4,  1.5],
           [ 2.4,  2.5,  2.6],
           [ 3.5,  3.6,  3.7]])
    

    This still has an iteration, but it's a rather basic one over a list. I'm indexing the flattened, 1d, version of A. np.take(A, index_list) also works.
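As a rough check of the claim above, the following sketch builds the same flat offsets (row number times row length, plus column) and compares `A.flat[...]` with `np.take`. The names `ncols`, `flat_index`, `via_flat` and `via_take` are illustrative and not part of the original answer.

```python
import numpy as np

A = np.arange(40).reshape(4, 10) * 0.1
startend = [[2, 5], [3, 6], [4, 7], [5, 8]]

ncols = A.shape[1]
# Offset of element (row, col) in the flattened array is row * ncols + col.
flat_index = np.array([np.arange(start, end) + row * ncols
                       for row, (start, end) in enumerate(startend)])

via_flat = A.flat[flat_index]      # index the 1d view of A
via_take = np.take(A, flat_index)  # np.take flattens A when axis is None

print(np.array_equal(via_flat, via_take))
```

Both results keep the 2d shape of the index array, so equal-length intervals give one output row per input row.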

    If the row intervals differ in size, I can use np.r_ to concatenate them. It's not absolutely necessary, but it is a convenience when building up indices from multiple intervals and values.

    A.flat[np.r_[tuple(index_list)]]
    # array([ 0.2,  0.3,  0.4,  1.3,  1.4,  1.5,  2.4,  2.5,  2.6,  3.5,  3.6, 3.7])
    

    The idx that ajcr uses can be used without choose:

    idx = [np.arange(start, end) for start, end in startend]
    A[np.arange(A.shape[0])[:,None], idx]
    

    idx is like my index_list except that it doesn't add the row length.

    np.array(idx)
    
    array([[2, 3, 4],
           [3, 4, 5],
           [4, 5, 6],
           [5, 6, 7]])
    

    Since each arange has the same length, idx can be generated without iteration:

    col_start = np.array([2,3,4,5])
    idx = col_start[:,None] + np.arange(3)
    

    The first index is a column array that broadcasts to match this idx.

    np.arange(A.shape[0])[:,None] 
    array([[0],
           [1],
           [2],
           [3]])
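To make the broadcasting step concrete, here is a small sketch that prints the shapes involved. The variable names `rows` and `width` are assumptions introduced for illustration only.

```python
import numpy as np

A = np.arange(40).reshape(4, 10) * 0.1
col_start = np.array([2, 3, 4, 5])
width = 3  # every row selects the same number of columns

rows = np.arange(A.shape[0])[:, None]        # shape (4, 1): one row number per row
idx = col_start[:, None] + np.arange(width)  # shape (4, 3): column numbers per row

# (4, 1) broadcasts against (4, 3), so result[i, j] == A[i, idx[i, j]].
result = A[rows, idx]
print(rows.shape, idx.shape, result.shape)
```

Because the row index has a trailing axis of length 1, NumPy repeats each row number across the three column positions rather than pairing indices element by element.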
    

    With this A and idx I get the following timings:

    In [515]: timeit np.choose(idx,A.T[:,:,None])
    10000 loops, best of 3: 30.8 µs per loop
    
    In [516]: timeit A[np.arange(A.shape[0])[:,None],idx]
    100000 loops, best of 3: 10.8 µs per loop
    
    In [517]: timeit A.flat[idx+np.arange(A.shape[0])[:,None]*A.shape[1]]
    10000 loops, best of 3: 24.9 µs per loop
    

    Flat indexing itself is fast, but computing the flat index takes some time, so for this small array the 2D indexing in In [516] is quicker overall.

    For large arrays, the speed of flat indexing dominates.

    A=np.arange(4000).reshape(40,100)*.1
    col_start=np.arange(20,60)
    idx=col_start[:,None]+np.arange(30)
    
    In [536]: timeit A[np.arange(A.shape[0])[:,None],idx]
    10000 loops, best of 3: 108 µs per loop
    
    In [537]: timeit A.flat[idx+np.arange(A.shape[0])[:,None]*A.shape[1]]
    10000 loops, best of 3: 59.4 µs per loop
    

    The np.choose method fails for this larger array because it runs into a hardcoded limit on the number of choice arrays (A.T has 100 of them): Need between 2 and (32) array objects (inclusive).
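For arrays too wide for np.choose, one alternative that is not used in the original answers is `np.take_along_axis` (available in NumPy 1.15 and later). The sketch below assumes the same large `A` and `idx` as in the timings and compares the result with the 2D indexing expression.

```python
import numpy as np

A = np.arange(4000).reshape(40, 100) * 0.1
col_start = np.arange(20, 60)
idx = col_start[:, None] + np.arange(30)  # shape (40, 30), all values < 100

# For each row i, pick A[i, idx[i, j]]; idx must have the same ndim as A.
picked = np.take_along_axis(A, idx, axis=1)

# Same selection written with explicit row indices, for comparison.
reference = A[np.arange(A.shape[0])[:, None], idx]
print(np.array_equal(picked, reference))
```

This avoids building the row index by hand; whether it is faster than the other approaches would need to be measured on the data at hand.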


    What about out-of-bounds idx values?

    A=np.arange(40).reshape(4,10)*.1   # back to the original 4x10 array
    col_start=np.array([2,4,6,8])
    idx=col_start[:,None]+np.arange(3)
    A[np.arange(A.shape[0])[:,None], idx]
    

    raises an IndexError because the last idx value is 10, which is out of bounds for an array with 10 columns (valid column indices are 0 to 9).

    You could clip idx

    idx=idx.clip(0,A.shape[1]-1)
    

    producing duplicate values in the last row

    [ 3.8,  3.9,  3.9]
    

    You could also pad A before indexing. See np.pad for more options.

    np.pad(A,((0,0),(0,2)),'edge')[np.arange(A.shape[0])[:,None], idx]
    

    Another option is to remove out of bounds values. idx would then become a ragged list of lists (or array of lists). The flat approach can handle this, though the result will not be a matrix.

    startend = [[2,5],[4,7],[6,9],[8,10]]
    index_list = [np.arange(v[0],v[1]) + i*A.shape[1] 
                     for i,v in enumerate(startend)]
    # [array([2, 3, 4]), array([14, 15, 16]), array([26, 27, 28]), array([38, 39])]
    
    A.flat[np.r_[tuple(index_list)]]
    # array([ 0.2,  0.3,  0.4,  1.4,  1.5,  1.6,  2.6,  2.7,  2.8,  3.8,  3.9])
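If a rectangular result is still wanted, another possibility (not from the original answer) is to mask the out-of-bounds positions and fill them with a placeholder. The sketch below uses NaN as the fill value; `valid` and `safe_idx` are hypothetical names.

```python
import numpy as np

A = np.arange(40).reshape(4, 10) * 0.1
col_start = np.array([2, 4, 6, 8])
idx = col_start[:, None] + np.arange(3)  # last row reaches column 10

valid = idx < A.shape[1]            # True where the column index exists
safe_idx = np.where(valid, idx, 0)  # replace bad indices with a harmless one
rows = np.arange(A.shape[0])[:, None]

# Look up with the safe index, then overwrite the invalid slots with NaN.
result = np.where(valid, A[rows, safe_idx], np.nan)
print(result)
```

Unlike clipping, this does not repeat the edge value, and unlike the ragged flat index it keeps one row of output per row of `A`.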
    
  • 2020-12-06 15:29

    I think you're looking for something like the below. I'm not sure what you want to do with them when you access them though.

    indexes = [(4, 6), (0, 2), (2, 4), (8, 10)]   # inclusive (start, end) per row
    # '|' marks the elements to select in each row
    arr = [
        ". . . . | | | . . . . .".split(),
        "| | | . . . . . . . . .".split(),
        ". . | | | . . . . . . .".split(),
        ". . . . . . . . | | | .".split(),
    ]

    for (start, end), row in zip(indexes, arr):
        print(row[start:end + 1])
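The same loop can be applied to a numeric NumPy array. This is a minimal sketch assuming a 4x12 array of integers in place of the schematic above, with inclusive end indices as in the answer; `pieces` and `result` are illustrative names.

```python
import numpy as np

arr = np.arange(48).reshape(4, 12)
indexes = [(4, 6), (0, 2), (2, 4), (8, 10)]  # inclusive (start, end) per row

# Slice each row separately; end + 1 because Python slices exclude the stop.
pieces = [row[start:end + 1] for (start, end), row in zip(indexes, arr)]

# All slices have length 3 here, so they can be stacked into a 2d array.
result = np.vstack(pieces)
print(result)
```

The stacking step only works when every interval has the same length; with unequal lengths the list `pieces` would have to be used as is.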
    
  • 2020-12-06 15:44

    You can use np.choose.

    Here's an example NumPy array arr:

    array([[ 0,  1,  2,  3,  4,  5,  6],
           [ 7,  8,  9, 10, 11, 12, 13],
           [14, 15, 16, 17, 18, 19, 20]])
    

    Let's say we want to pick the values [1, 2, 3] from the first row, [11, 12, 13] from the second row and [17, 18, 19] from the third row.

    In other words, we'll pick out the indices from each row of arr as shown in an array idx:

    array([[1, 2, 3],
           [4, 5, 6],
           [3, 4, 5]])
    

    Then using np.choose:

    >>> np.choose(idx, arr.T[:,:,np.newaxis])
    array([[ 1,  2,  3],
           [11, 12, 13],
           [17, 18, 19]])
    

    To explain what just happened: arr.T[:,:,np.newaxis] views arr as a 3D array with shape (7, 3, 1). You can picture this as a sequence of seven 2D column vectors, one per column of the original arr, each holding three values. The 3D array looks a bit like this:

    #  0       1       2       3       4       5       6
    [[ 0]   [[ 1]   [[ 2]   [[ 3]   [[ 4]   [[ 5]   [[ 6]   # choose values from 1, 2, 3
     [ 7]    [ 8]    [ 9]    [10]    [11]    [12]    [13]   # choose values from 4, 5, 6
     [14]]   [15]]   [16]]   [17]]   [18]]   [19]]   [20]]  # choose values from 3, 4, 5
    

    To get the zeroth row of the output array, choose selects the zeroth element from the 2D column at index 1, the zeroth element from the 2D column at index 2, and the zeroth element from the 2D column at index 3.

    To get the first row of the output array, choose selects the first element from the 2D column at index 4, the first element from the 2D column at index 5, ... and so on.
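The walkthrough above can be checked with a short script. It rebuilds `arr` and `idx` from the example, inspects the shape of the choices array, and compares np.choose against plain advanced indexing; `choices`, `via_choose` and `via_indexing` are names introduced here for illustration.

```python
import numpy as np

arr = np.arange(21).reshape(3, 7)
idx = np.array([[1, 2, 3],
                [4, 5, 6],
                [3, 4, 5]])

# Seven choice arrays, each a (3, 1) column taken from arr.
choices = arr.T[:, :, np.newaxis]
print(choices.shape)

# via_choose[i, j] == choices[idx[i, j]][i, 0] == arr[i, idx[i, j]]
via_choose = np.choose(idx, choices)

# The same selection with explicit row indices.
via_indexing = arr[np.arange(arr.shape[0])[:, None], idx]
print(np.array_equal(via_choose, via_indexing))
```

Each (3, 1) choice broadcasts against the (3, 3) `idx`, which is why the row position in the output decides which element of the chosen column is used.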
