Convert 1D array with coordinates into 2D array in numpy

*爱你&永不变心* submitted on 2019-12-10 22:14:36

Question


I have an array of values arr with shape (N,) and an array of coordinates coords with shape (2,N), as in the example below (row indices in the first row, column indices in the second). I want to represent this in an (M,M) array grid such that grid is 0 at coordinates that do not appear in coords and, at each coordinate that does appear, holds the sum of all values in arr that share that coordinate. So if M=3, arr = np.arange(4)+1, and coords = np.array([[0,0,1,2],[0,0,2,2]]), then grid should be:

array([[3., 0., 0.],
       [0., 0., 3.],
       [0., 0., 4.]])

The reason this is nontrivial is that I need to repeat this step many times, and both the values in arr and the coordinates can change each time. Ideally I am looking for a vectorized solution. I suspect that np.where might help, but it is not immediately obvious how.
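For reference, the desired behaviour can be pinned down with a plain Python loop. This is only a sketch of the slow, non-vectorized baseline that the question wants to avoid; build_grid_loop is a hypothetical name, and coords is assumed to have shape (2,N) as in the example above.

```python
import numpy as np

def build_grid_loop(coords, arr, M):
    # Assumption: coords has shape (2, N), row indices first, column indices second.
    grid = np.zeros((M, M))
    for (i, j), value in zip(coords.T, arr):
        grid[i, j] += value  # values at repeated coordinates add up
    return grid

arr = np.arange(4) + 1
coords = np.array([[0, 0, 1, 2], [0, 0, 2, 2]])
grid = build_grid_loop(coords, arr, 3)
print(grid)
```

Any vectorized solution should reproduce the output of this loop for the same inputs.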

Timing the solutions

I have timed the solutions posted so far, and it appears that the np.bincount accumulator method is slightly faster than the sparse matrix method, with the second accumulation method (np.add.at) being the slowest for the reasons explained in the comments:

%timeit for x in range(100): accumulate_arr(np.random.randint(100,size=(2,10000)),np.random.normal(0,1,10000))
%timeit for x in range(100): accumulate_arr_v2(np.random.randint(100,size=(2,10000)),np.random.normal(0,1,10000))
%timeit for x in range(100): sparse.coo_matrix((np.random.normal(0,1,10000),np.random.randint(100,size=(2,10000))),(100,100)).A
47.3 ms ± 1.79 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
103 ms ± 255 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
48.2 ms ± 36 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

Answer 1:


With np.bincount -

import numpy as np

def accumulate_arr(coords, arr):
    # Get output array shape
    m,n = coords.max(1)+1

    # Get linear indices to be used as IDs with bincount
    lidx = np.ravel_multi_index(coords, (m,n))
    # Or lidx = coords[0]*(coords[1].max()+1) + coords[1]

    # Accumulate arr with IDs from lidx
    return np.bincount(lidx,arr,minlength=m*n).reshape(m,n)

Sample run -

In [58]: arr
Out[58]: array([1, 2, 3, 4])

In [59]: coords
Out[59]: 
array([[0, 0, 1, 2],
       [0, 0, 2, 2]])

In [60]: accumulate_arr(coords, arr)
Out[60]: 
array([[3., 0., 0.],
       [0., 0., 3.],
       [0., 0., 4.]])
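Note that accumulate_arr infers the output shape from the largest coordinate present, so the result can be smaller than (M,M) when the trailing rows or columns happen to be empty. Below is a minimal sketch of a variant that takes the shape explicitly; accumulate_arr_fixed and its shape parameter are hypothetical names and not part of the answer above.

```python
import numpy as np

def accumulate_arr_fixed(coords, arr, shape):
    # Assumption: the caller supplies the grid shape instead of it being inferred.
    lidx = np.ravel_multi_index(tuple(coords), shape)
    size = shape[0] * shape[1]
    # bincount sums the weights that share the same linear index
    return np.bincount(lidx, weights=arr, minlength=size).reshape(shape)

# No coordinate touches row 2 or column 2, yet the grid should still be 3x3.
coords = np.array([[0, 0, 1], [0, 0, 1]])
arr = np.array([1.0, 2.0, 3.0])
out = accumulate_arr_fixed(coords, arr, (3, 3))
print(out.shape)
```

With the inferred-shape version the same inputs would produce a (2,2) array, which matters if the grids from successive steps are to be combined.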

Another approach along similar lines, using np.add.at, which might be easier to follow -

def accumulate_arr_v2(coords, arr):
    m,n = coords.max(1)+1
    out = np.zeros((m,n), dtype=arr.dtype)
    np.add.at(out, tuple(coords), arr)
    return out
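The reason np.add.at is used here rather than a plain fancy-indexed += is that the latter is buffered, so values at a repeated coordinate are not accumulated. A short illustration with the sample data (the variable names buffered and unbuffered are only for this sketch):

```python
import numpy as np

arr = np.array([1, 2, 3, 4])
coords = np.array([[0, 0, 1, 2], [0, 0, 2, 2]])

# Buffered: the two values aimed at (0, 0) are not summed.
buffered = np.zeros((3, 3), dtype=arr.dtype)
buffered[tuple(coords)] += arr

# Unbuffered: np.add.at applies every addition, so duplicates accumulate.
unbuffered = np.zeros((3, 3), dtype=arr.dtype)
np.add.at(unbuffered, tuple(coords), arr)

print(buffered[0, 0], unbuffered[0, 0])
```

Only the np.add.at result holds the sum 1 + 2 = 3 at position (0, 0).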



Answer 2:


One way would be to create a sparse.coo_matrix and convert that to dense:

from scipy import sparse
# .toarray() is the explicit equivalent of the .A shorthand
sparse.coo_matrix((arr,coords),(M,M)).toarray()
# array([[3, 0, 0],
#        [0, 0, 3],
#        [0, 0, 4]])


Source: https://stackoverflow.com/questions/56462192/convert-1d-array-with-coordinates-into-2d-array-in-numpy
