Count unique elements row wise in an ndarray


日久生厌 2020-12-19 20:50

An extension to this question. In addition to having the unique elements row-wise, I want to have a similarly shaped array that gives me the count of unique values. For exam

2 answers

独厮守ぢ (original poster)

2020-12-19 21:37

This method does the same as np.unique for each row: it sorts each row and measures the lengths of the runs of consecutive equal values. For an N x M array this has complexity O(NM log(M)), which is better than running unique on the whole array, since that has complexity O(NM log(NM)).

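import numpy as np

def row_unique_count(a):
    # Sort each row; args maps sorted positions back to original columns
    args = np.argsort(a)
    unique = a[np.indices(a.shape)[0], args]
    # Mark the first element of every run of equal values in each sorted row
    changes = np.pad(unique[:, 1:] != unique[:, :-1], ((0, 0), (1, 0)), mode="constant", constant_values=1)
    idxs = np.nonzero(changes)
    # Run length = gap to the next run start, or to the row end for the last run of a row
    tmp = np.hstack((idxs[-1], 0))
    counts = np.where(tmp[1:], np.diff(tmp), a.shape[-1]-tmp[:-1])
    # Write each count at the original column of the element that starts its run; other cells stay 0
    count_array = np.zeros(a.shape, dtype="int")
    count_array[(idxs[0], args[idxs])] = counts
    return count_array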
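
To make the sort-and-run-length idea easier to follow, here is a minimal sketch of the same technique written as a plain loop over rows. The name `row_unique_count_loop` and the sample array are illustrative and not from the original post. The sketch assumes a stable sort so that each count lands on the first occurrence of its value, and it assumes non-empty rows.

```python
import numpy as np

def row_unique_count_loop(a):
    """Loop-based sketch: per-row counts stored at each value's first occurrence."""
    out = np.zeros(a.shape, dtype=int)
    for r, row in enumerate(a):
        # Assumption: stable sort, so equal values keep their original order
        order = np.argsort(row, kind="stable")
        srt = row[order]
        # Positions in the sorted row where a new run of equal values starts
        starts = np.flatnonzero(np.r_[True, srt[1:] != srt[:-1]])
        # Run lengths: gap between consecutive starts; the last run ends at the row end
        lengths = np.diff(np.r_[starts, row.size])
        # Map each run start back to its original column and store the run length there
        out[r, order[starts]] = lengths
    return out

sample = np.array([[1, 2, 2, 3],
                   [4, 4, 4, 5]])  # hypothetical input
print(row_unique_count_loop(sample))
# [[1 2 0 1]
#  [3 0 0 1]]
```

The loop is slower than the vectorized function but shows each step in isolation: sort, find run starts, take differences, scatter back.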

Running times:

In [162]: b = np.random.random(size=100000).reshape((100, 1000))

In [163]: %timeit row_unique_count(b)
100 loops, best of 3: 10.4 ms per loop

In [164]: %timeit count_unique_by_row(b)
100 loops, best of 3: 19.4 ms per loop

In [165]: assert np.all(row_unique_count(b) == count_unique_by_row(b))
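
`count_unique_by_row` comes from another answer and is not reproduced here. For readers who want something to compare against, a per-row `np.unique` baseline might look like the sketch below. The name `count_unique_by_row_baseline` is hypothetical, and this is not claimed to be that answer's code.

```python
import numpy as np

def count_unique_by_row_baseline(a):
    """Hypothetical baseline: call np.unique on each row separately."""
    out = np.zeros(a.shape, dtype=int)
    for r, row in enumerate(a):
        # return_index gives the first occurrence of each unique value,
        # return_counts gives how often that value appears in the row
        _, first, counts = np.unique(row, return_index=True, return_counts=True)
        out[r, first] = counts
    return out

small = np.array([[2, 1, 2],
                  [5, 5, 5]])  # hypothetical input with duplicates
print(count_unique_by_row_baseline(small))
# [[2 1 0]
#  [3 0 0]]
```

Note that `b` above holds random floats, so duplicates are essentially absent and every count is 1. On data with many duplicates, the default `np.argsort` is not guaranteed to be stable, so the column that receives a count may differ from this baseline unless `kind="stable"` is passed.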
