Access multiple elements of an array

后端未结

Is there a way to get array elements in one operation for known rows and columns of those elements? In each row I would like to access elements from col_start to col_end (ea

3 Answers

悲哀的现实

2020-12-06 15:28

import numpy as np
A = np.arange(40).reshape(4,10)*.1
startend = [[2,5],[3,6],[4,7],[5,8]]
index_list = [np.arange(v[0],v[1]) + i*A.shape[1] 
                 for i,v in enumerate(startend)]
# [array([2, 3, 4]), array([13, 14, 15]), array([24, 25, 26]), array([35, 36, 37])]
A.flat[index_list]

producing

array([[ 0.2,  0.3,  0.4],
       [ 1.3,  1.4,  1.5],
       [ 2.4,  2.5,  2.6],
       [ 3.5,  3.6,  3.7]])

This still has an iteration, but it's a rather basic one over a list. I'm indexing the flattened, 1d, version of A. np.take(A, index_list) also works.
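As a minimal sketch of that equivalence, the snippet below builds the same flat indices and fetches them three ways. The names `index_array`, `via_flat`, `via_take`, and `via_ravel` are illustrative only and do not appear in the answer; converting the list to an array with `np.array` is an assumption made so the index has an explicit 2D shape.

```python
import numpy as np

A = np.arange(40).reshape(4, 10) * 0.1
startend = [[2, 5], [3, 6], [4, 7], [5, 8]]

# Offset each row's column range by i * (row length) to get positions
# in the flattened array. All intervals have length 3 here, so the
# list of ranges stacks into a regular (4, 3) integer array.
index_array = np.array([np.arange(s, e) + i * A.shape[1]
                        for i, (s, e) in enumerate(startend)])

via_flat = A.flat[index_array]       # index through the flat iterator
via_take = np.take(A, index_array)   # take() uses the flattened array when axis is None
via_ravel = A.ravel()[index_array]   # index a flattened view directly

print(via_take)
```

All three results keep the (4, 3) shape of the index array.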

If the row intervals differ in size, I can use np.r_ to concatenate them. It's not absolutely necessary, but it is a convenience when building up indices from multiple intervals and values.

A.flat[np.r_[tuple(index_list)]]
# array([ 0.2,  0.3,  0.4,  1.3,  1.4,  1.5,  2.4,  2.5,  2.6,  3.5,  3.6, 3.7])

The idx that ajcr uses can be used without choose:

idx = [np.arange(v[0], v[1]) for v in startend]
A[np.arange(A.shape[0])[:,None], idx]

idx is like my index_list except that it doesn't add the row length.

np.array(idx)

array([[2, 3, 4],
       [3, 4, 5],
       [4, 5, 6],
       [5, 6, 7]])

Since each arange has the same length, idx can be generated without iteration:

col_start = np.array([2,3,4,5])
idx = col_start[:,None] + np.arange(3)

The first index is a column array that broadcasts to match this idx.

np.arange(A.shape[0])[:,None] 
array([[0],
       [1],
       [2],
       [3]])
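Put together, the broadcasting step might look like the sketch below. The variable names `rows`, `cols`, and `width` are illustrative, and `width = 3` assumes every interval has the same length.

```python
import numpy as np

A = np.arange(40).reshape(4, 10) * 0.1
col_start = np.array([2, 3, 4, 5])
width = 3  # assumed common interval length

rows = np.arange(A.shape[0])[:, None]         # shape (4, 1): one row number per row
cols = col_start[:, None] + np.arange(width)  # shape (4, 3): column numbers per row

# rows (4, 1) broadcasts against cols (4, 3), so result[i, j] is
# A[rows[i, 0], cols[i, j]].
result = A[rows, cols]
print(result.shape)
```

Because the row index is a column vector, each row number is paired with every column number in the matching row of `cols`.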

With this A and idx I get the following timings:

In [515]: timeit np.choose(idx,A.T[:,:,None])
10000 loops, best of 3: 30.8 µs per loop

In [516]: timeit A[np.arange(A.shape[0])[:,None],idx]
100000 loops, best of 3: 10.8 µs per loop

In [517]: timeit A.flat[idx+np.arange(A.shape[0])[:,None]*A.shape[1]]
10000 loops, best of 3: 24.9 µs per loop

The flat indexing itself is faster, but calculating the fancier flat index takes up some time, so for this small array the flat version is slower overall than plain 2D indexing.

For large arrays, the speed of flat indexing dominates.

A=np.arange(4000).reshape(40,100)*.1
col_start=np.arange(20,60)
idx=col_start[:,None]+np.arange(30)

In [536]: timeit A[np.arange(A.shape[0])[:,None],idx]
10000 loops, best of 3: 108 µs per loop

In [537]: timeit A.flat[idx+np.arange(A.shape[0])[:,None]*A.shape[1]]
10000 loops, best of 3: 59.4 µs per loop
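To repeat this comparison outside IPython, a rough sketch using the standard `timeit` module could look like this. The helper names `fancy` and `flat` and the repeat count are assumptions for illustration; timings depend on the machine and NumPy version, so no numbers are quoted.

```python
import timeit
import numpy as np

A = np.arange(4000).reshape(40, 100) * 0.1
col_start = np.arange(20, 60)
idx = col_start[:, None] + np.arange(30)
rows = np.arange(A.shape[0])[:, None]

def fancy():
    # 2D advanced indexing with a broadcast row index
    return A[rows, idx]

def flat():
    # flat indexing; the index arithmetic is part of the timed work
    return A.flat[idx + rows * A.shape[1]]

repeats = 2000  # arbitrary repeat count for this sketch
for fn in (fancy, flat):
    seconds = timeit.timeit(fn, number=repeats)
    print(f"{fn.__name__}: {seconds / repeats * 1e6:.1f} µs per call")
```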

With this larger array (100 columns, so 100 choice arrays), the np.choose method runs into a hardcoded limit: Need between 2 and (32) array objects (inclusive).
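One possible alternative that is not part of this answer is `np.take_along_axis`, available in NumPy 1.15 and later, which takes the same per-row column indices without a choice-array limit. The sketch below only checks that it agrees with the 2D indexing used above.

```python
import numpy as np

A = np.arange(4000).reshape(40, 100) * 0.1
col_start = np.arange(20, 60)
idx = col_start[:, None] + np.arange(30)   # shape (40, 30)

# For axis=1, picked[i, j] is A[i, idx[i, j]].
picked = np.take_along_axis(A, idx, axis=1)

# Same selection with an explicit broadcast row index.
expected = A[np.arange(A.shape[0])[:, None], idx]
print(np.array_equal(picked, expected))
```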

What about out-of-bounds idx?

col_start=np.array([2,4,6,8])
idx=col_start[:,None]+np.arange(3)
A[np.arange(A.shape[0])[:,None], idx]

produces an error because the last idx value is 10, which is larger than the last valid column index, 9.

You could clip idx

idx=idx.clip(0,A.shape[1]-1)

producing duplicate values in the last row

[ 3.8,  3.9,  3.9]

You could also pad A before indexing. See np.pad for more options.

np.pad(A,((0,0),(0,2)),'edge')[np.arange(A.shape[0])[:,None], idx]
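The clip and pad options can be compared side by side, as in the sketch below. The names `overflow`, `clipped`, `padded`, and `nan_padded` are illustrative; padding with NaN through `mode="constant"` is one of the other `np.pad` options and is shown here as an assumption about what a caller might prefer.

```python
import numpy as np

A = np.arange(40).reshape(4, 10) * 0.1
col_start = np.array([2, 4, 6, 8])
idx = col_start[:, None] + np.arange(3)   # last row is [8, 9, 10]; 10 is out of bounds
rows = np.arange(A.shape[0])[:, None]

# Option 1: clip the indices, which repeats the last valid column.
clipped = A[rows, idx.clip(0, A.shape[1] - 1)]

# Option 2: pad A on the right by however many columns idx overshoots.
overflow = max(int(idx.max()) - (A.shape[1] - 1), 0)
padded = np.pad(A, ((0, 0), (0, overflow)), mode="edge")[rows, idx]

# Variation: pad with NaN so out-of-bounds picks are easy to spot.
nan_padded = np.pad(A, ((0, 0), (0, overflow)), mode="constant",
                    constant_values=np.nan)[rows, idx]

print(clipped[-1], padded[-1], nan_padded[-1])
```

Edge padding and clipping give the same values; the NaN variant marks the out-of-bounds position instead of duplicating 3.9.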

Another option is to remove out of bounds values. idx would then become a ragged list of lists (or array of lists). The flat approach can handle this, though the result will not be a matrix.

startend = [[2,5],[4,7],[6,9],[8,10]]
index_list = [np.arange(v[0],v[1]) + i*A.shape[1] 
                 for i,v in enumerate(startend)]
# [array([2, 3, 4]), array([14, 15, 16]), array([26, 27, 28]), array([38, 39])]

A.flat[np.r_[tuple(index_list)]]
# array([ 0.2,  0.3,  0.4,  1.4,  1.5,  1.6,  2.6,  2.7,  2.8,  3.8,  3.9])
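If the per-row grouping is needed afterwards, one way to recover it is to split the flat result by the interval lengths. This is a sketch only: `lengths` and `pieces` are illustrative names, and `np.concatenate` is used in place of `np.r_`, which gives the same 1D index here.

```python
import numpy as np

A = np.arange(40).reshape(4, 10) * 0.1
startend = [[2, 5], [4, 7], [6, 9], [8, 10]]   # last interval is shorter

index_list = [np.arange(s, e) + i * A.shape[1]
              for i, (s, e) in enumerate(startend)]
values = A.flat[np.concatenate(index_list)]    # 1D array of 11 values

# Split the flat result back into one array per row, cutting at the
# cumulative interval lengths.
lengths = [len(ix) for ix in index_list]
pieces = np.split(values, np.cumsum(lengths)[:-1])

for piece in pieces:
    print(piece)
```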


暖寄归人

2020-12-06 15:29

I think you're looking for something like the below. I'm not sure what you want to do with them when you access them though.

# inclusive (start, end) column range for each row
indexes = [(4, 6), (0, 2), (2, 4), (8, 10)]
# '|' marks the elements to pick from each row
arr = [
    ". . . . | | | . . . . .".split(),
    "| | | . . . . . . . . .".split(),
    ". . | | | . . . . . . .".split(),
    ". . . . . . . . | | | .".split(),
]

for (start, end), row in zip(indexes, arr):
    print(row[start:end + 1])


天涯浪人

2020-12-06 15:44
You can use np.choose.

Here's an example NumPy array arr:
```
array([[ 0,  1,  2,  3,  4,  5,  6],
       [ 7,  8,  9, 10, 11, 12, 13],
       [14, 15, 16, 17, 18, 19, 20]])
```
Let's say we want to pick the values [1, 2, 3] from the first row, [11, 12, 13] from the second row and [17, 18, 19] from the third row.

In other words, we'll pick out the indices from each row of arr as shown in an array idx:
```
array([[1, 2, 3],
       [4, 5, 6],
       [3, 4, 5]])
```
Then using np.choose:
```
>>> np.choose(idx, arr.T[:,:,np.newaxis])
array([[ 1,  2,  3],
       [11, 12, 13],
       [17, 18, 19]])
```
To explain what just happened: arr.T[:,:,np.newaxis] meant that arr was temporarily viewed as a 3D array with shape (7, 3, 1). You can imagine this as a 3D array where each column of the original arr is now a 2D column vector with three values. The 3D array looks a bit like this:
```
#  0       1       2       3       4       5       6
[[ 0]   [[ 1]   [[ 2]   [[ 3]   [[ 4]   [[ 5]   [[ 6]   # choose values from 1, 2, 3
 [ 7]    [ 8]    [ 9]    [10]    [11]    [12]    [13]   # choose values from 4, 5, 6
 [14]]   [15]]   [16]]   [17]]   [18]]   [19]]   [20]]  # choose values from 3, 4, 5
```
To get the zeroth row of the output array, choose selects the zeroth element from the 2D column at index 1, the zeroth element from the 2D column at index 2, and the zeroth element from the 2D column at index 3.

To get the first row of the output array, choose selects the first element from the 2D column at index 4, the first element from the 2D column at index 5, ... and so on.
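The walkthrough above can be checked with a short self-contained sketch. The names `choices`, `picked`, and `expected` are illustrative, and the comparison against plain 2D indexing is added here only as a sanity check.

```python
import numpy as np

arr = np.arange(21).reshape(3, 7)
idx = np.array([[1, 2, 3],
                [4, 5, 6],
                [3, 4, 5]])

# View arr as 7 "choices", each a (3, 1) column of the original array.
choices = arr.T[:, :, np.newaxis]   # shape (7, 3, 1)

# For output position (i, j), choose takes element (i, 0) of the
# choice numbered idx[i, j], which is arr[i, idx[i, j]].
picked = np.choose(idx, choices)

# Same selection with a broadcast row index, for comparison.
expected = arr[np.arange(arr.shape[0])[:, None], idx]
print(picked)
```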