Can numpy bincount work with 2D arrays?

生来不讨喜 2020-12-03 17:21

I am seeing behaviour with numpy bincount that I cannot make sense of. I want to bin the values in a 2D array in a row-wise manner and see the behaviour below. Why would i

2 answers
  •  我在风中等你
    2020-12-03 17:48

    As @DSM has already mentioned, np.bincount cannot be applied row by row to a 2D array without knowing the maximum value of the array, because each row would otherwise produce a result of a different length.
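To illustrate the size mismatch, here is a minimal sketch using the same small array as the example further down. The names `ragged`, `lengths`, and `stacked` are only illustrative, and passing `minlength` is one possible way to force a common length.

```python
import numpy as np

t = np.array([[1, 2, 3], [4, 5, 6], [3, 2, 2]], dtype=np.int64)

# Without a shared bin count, each row's result has length max(row) + 1.
ragged = [np.bincount(row) for row in t]
lengths = [len(r) for r in ragged]
print(lengths)  # [4, 7, 4]

# Fixing the number of bins up front gives every row the same length,
# so the per-row results can be stacked into one 2D array.
bins = int(t.max()) + 1
stacked = np.stack([np.bincount(row, minlength=bins) for row in t])
print(stacked.shape)  # (3, 7)
```

This still loops over rows in Python, so it only demonstrates why the maximum has to be known in advance.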

    But thanks to NumPy's fancy indexing, it is fairly easy to write a fast 2D bincount that does not need concatenation or anything similar.

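    import numpy as np

    def bincount2d(arr, bins=None):
        # Number of bins defaults to the largest value in the whole array + 1
        if bins is None:
            bins = np.max(arr) + 1
        count = np.zeros(shape=[len(arr), bins], dtype=np.int64)
        indexing = np.arange(len(arr))
        # One pass per column: each row index appears once, so += is safe here
        for col in arr.T:
            count[indexing, col] += 1
        return count


    t = np.array([[1,2,3],[4,5,6],[3,2,2]], dtype=np.int64)
    print(bincount2d(t))
    # [[0 1 1 1 0 0 0]
    #  [0 0 0 0 1 1 1]
    #  [0 0 2 1 0 0 0]]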
    

    P.S.

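    This:

    import time

    # np.empty leaves the values uninitialized (possibly negative or huge),
    # so the array is filled with valid bin indices instead
    t = np.random.randint(0, 100, size=[10000, 100], dtype=np.int64)
    s = time.time()
    bincount2d(t)
    e = time.time()
    print(e - s)


    was about 2 times faster in my test than this:

    t = np.random.randint(0, 100, size=[100, 10000], dtype=np.int64)
    s = time.time()
    bincount2d(t)
    e = time.time()
    print(e - s)


    because the for loop iterates over columns, so the second case needs 10000 Python-level iterations instead of 100. So, if shape[0] < shape[1] and you are free to choose the layout, it's better to work with the transposed array. Keep in mind that transposing changes which axis is counted.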
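The effect of the column loop can be checked with a rough timing sketch like the one below. It assumes random data in the range 0..49, a fixed seed, and `timeit` with 5 repetitions, none of which come from the original post; `bincount2d_loop` is just a local copy of the loop-based idea. Actual timings depend on the machine, so no numbers are shown.

```python
import timeit
import numpy as np

def bincount2d_loop(arr, bins):
    # Same column-loop idea as the first version in the answer.
    count = np.zeros((len(arr), bins), dtype=np.int64)
    rows = np.arange(len(arr))
    for col in arr.T:
        count[rows, col] += 1
    return count

rng = np.random.default_rng(0)  # fixed seed, arbitrary choice
bins = 50
tall = rng.integers(0, bins, size=(10000, 100))  # loop runs 100 times
wide = rng.integers(0, bins, size=(100, 10000))  # loop runs 10000 times

t_tall = timeit.timeit(lambda: bincount2d_loop(tall, bins), number=5)
t_wide = timeit.timeit(lambda: bincount2d_loop(wide, bins), number=5)
print(f"tall: {t_tall:.4f}s  wide: {t_wide:.4f}s")

# Transposing changes what is counted: one output row per input row.
shape_wide = bincount2d_loop(wide, bins).shape      # (100, 50)
shape_wide_t = bincount2d_loop(wide.T, bins).shape  # (10000, 50)
print(shape_wide, shape_wide_t)
```

Both arrays hold the same number of elements; only the number of Python-level loop iterations differs.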

    UPD

    This version removes the Python loop entirely (it is the best I could come up with using Python and NumPy alone):

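    def bincount2d(arr, bins=None):
        if bins is None:
            bins = np.max(arr) + 1
        count = np.zeros(shape=[len(arr), bins], dtype=np.int64)
        # Column vector of row indices; it broadcasts against arr
        indexing = np.arange(len(arr))[:, None]
        # np.add.at accumulates correctly when an index pair repeats,
        # which a plain count[indexing, arr] += 1 would not do
        np.add.at(count, (indexing, arr), 1)

        return count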
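For comparison, a different technique that is often used for this task calls `np.bincount` itself on an offset copy of the array. This is not the answer's method, only a sketch: the name `bincount2d_offset` is hypothetical, and it assumes non-negative integer values that are all smaller than `bins`.

```python
import numpy as np

def bincount2d_offset(arr, bins=None):
    # Assumes non-negative integers, all strictly less than `bins`.
    arr = np.asarray(arr)
    if bins is None:
        bins = int(arr.max()) + 1
    n_rows = arr.shape[0]
    # Shift row i into its own block of indices [i * bins, (i + 1) * bins).
    shifted = arr + bins * np.arange(n_rows)[:, None]
    # One flat bincount over all blocks, then fold back to (n_rows, bins).
    flat = np.bincount(shifted.ravel(), minlength=n_rows * bins)
    return flat.reshape(n_rows, bins)

t = np.array([[1, 2, 3], [4, 5, 6], [3, 2, 2]], dtype=np.int64)
result = bincount2d_offset(t)
print(result)
```

If a caller passes a `bins` value smaller than the array maximum, counts would spill into the next row's block, so a real implementation would want to validate that input.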
    
