Create a 2D array from another array and its indices with NumPy

悲&欢浪女 2020-12-07 03:26

Given an array:

arr = np.array([[1, 3, 7], [4, 9, 8]]); arr

array([[1, 3, 7],
       [4, 9, 8]])

And given its indices:

np         


        
2 Answers
  •  醉梦人生
    2020-12-07 03:40

    Initialize the output array first, then fill in the row indices, the column indices and the array values with broadcasted assignments in subsequent steps -

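    import numpy as np

    def indices_merged_arr(arr):
        m, n = arr.shape
        # Open grids: I has shape (m, 1), J has shape (1, n)
        I, J = np.ogrid[:m, :n]
        # One (row, col, value) triple per element; filled in below
        out = np.empty((m, n, 3), dtype=arr.dtype)
        out[..., 0] = I    # row indices, broadcast across columns
        out[..., 1] = J    # column indices, broadcast across rows
        out[..., 2] = arr  # original values
        # Flatten to one (row, col, value) row per element
        return out.reshape(-1, 3)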
    

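    Note that this avoids np.indices(arr.shape), which builds the full index arrays up front and could have slowed things down. np.ogrid only creates small open grids that are broadcast during assignment.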
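
To make the difference between the two index helpers concrete, here is a minimal sketch. The 2 x 3 shape is only an illustrative choice; it compares the shapes each helper returns and checks that broadcasting the open grids gives the same index values.

```python
import numpy as np

m, n = 2, 3  # illustrative shape, matching the sample array

# np.ogrid returns "open" grids: one (m, 1) column and one (1, n) row.
I, J = np.ogrid[:m, :n]
print(I.shape, J.shape)   # (2, 1) (1, 3)

# np.indices materializes the full index grids up front.
full = np.indices((m, n))
print(full.shape)         # (2, 2, 3)

# Broadcasting the open grids during assignment yields the same values.
out = np.empty((m, n, 2), dtype=int)
out[..., 0] = I
out[..., 1] = J
same = np.array_equal(out, np.moveaxis(full, 0, -1))
print(same)               # True
```

The open grids hold only m + n numbers, whereas np.indices allocates 2 * m * n, which is the likely source of the extra cost on large inputs.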

    Sample run -

    In [10]: arr = np.array([[1, 3, 7], [4, 9, 8]])
    
    In [11]: indices_merged_arr(arr)
    Out[11]: 
    array([[0, 0, 1],
           [0, 1, 3],
           [0, 2, 7],
           [1, 0, 4],
           [1, 1, 9],
           [1, 2, 8]])
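
As a sanity check, the result can be compared against the np.indices plus np.hstack construction used as the baseline in the timings. This is only a sketch: the name indices_merged_arr_reference is hypothetical and was chosen here for illustration.

```python
import numpy as np

def indices_merged_arr(arr):
    m, n = arr.shape
    I, J = np.ogrid[:m, :n]
    out = np.empty((m, n, 3), dtype=arr.dtype)
    out[..., 0] = I
    out[..., 1] = J
    out[..., 2] = arr
    return out.reshape(-1, 3)

def indices_merged_arr_reference(arr):
    # Hypothetical helper name: the np.indices / np.hstack baseline.
    idx = np.indices(arr.shape).reshape(2, arr.size).T
    return np.hstack((idx, arr.reshape(-1, 1)))

arr = np.array([[1, 3, 7], [4, 9, 8]])
matches = np.array_equal(indices_merged_arr(arr),
                         indices_merged_arr_reference(arr))
print(matches)
```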
    

    Performance

    arr = np.random.randn(100000, 2)
    
    %timeit df = pd.DataFrame(np.hstack((np.indices(arr.shape).reshape(2, arr.size).T,\
                                    arr.reshape(-1, 1))), columns=['x', 'y', 'value'])
    100 loops, best of 3: 4.97 ms per loop
    
    %timeit pd.DataFrame(indices_merged_arr_divakar(arr), columns=['x', 'y', 'value'])
    100 loops, best of 3: 3.82 ms per loop
    
    %timeit pd.DataFrame(indices_merged_arr_eric(arr), columns=['x', 'y', 'value'], dtype=np.float32)
    100 loops, best of 3: 5.59 ms per loop
    

    Note: The timings include the conversion to a pandas DataFrame, which is the eventual use case for this solution.
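
For the DataFrame use case, one thing worth knowing is that the merged array has a single dtype, so with float data the index columns come out as floats too. The sketch below uses illustrative float values (an assumption, not data from the question) and casts the index columns back to integers afterwards.

```python
import numpy as np
import pandas as pd

def indices_merged_arr(arr):
    m, n = arr.shape
    I, J = np.ogrid[:m, :n]
    out = np.empty((m, n, 3), dtype=arr.dtype)
    out[..., 0] = I
    out[..., 1] = J
    out[..., 2] = arr
    return out.reshape(-1, 3)

# Illustrative float data (assumed for this example).
arr = np.array([[1.5, 3.0, 7.0], [4.0, 9.0, 8.0]])

merged = indices_merged_arr(arr)
df = pd.DataFrame(merged, columns=['x', 'y', 'value'])

# All three columns share arr's float dtype at this point,
# so cast the index columns back to integers if that matters.
df = df.astype({'x': 'int64', 'y': 'int64'})
print(df.dtypes)
```

Whether the extra cast is worth it depends on how the index columns are used later; it adds a copy that the timings above do not include.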
