Pandas has a widely-used groupby facility to split up a DataFrame based on a corresponding mapping, from which you can apply a calculation on each subgroup and recombine the
@klim's sparse matrix solution would at first sight appear to be tied to summation. We can, however, use it in the general case by converting between the csr and csc formats:
Let's look at a small example:
>>> m, n = 3, 8
>>> idx = np.random.randint(0, m, (n,))
>>> data = np.arange(n)
>>>
>>> M = sparse.csr_matrix((data, idx, np.arange(n+1)), (n, m))
>>>
>>> idx
array([0, 2, 2, 1, 1, 2, 2, 0])
>>>
>>> M = M.tocsc()
>>>
>>> M.indptr, M.indices
(array([0, 2, 4, 8], dtype=int32), array([0, 7, 3, 4, 1, 2, 5, 6], dtype=int32))
As we can see after conversion the internal representation of the sparse matrix yields the indices grouped and sorted:
>>> groups = np.split(M.indices, M.indptr[1:-1])
>>> groups
[array([0, 7], dtype=int32), array([3, 4], dtype=int32), array([1, 2, 5, 6], dtype=int32)]
>>>
We could have obtained the same using a stable argsort:
>>> np.argsort(idx, kind='mergesort')
array([0, 7, 3, 4, 1, 2, 5, 6])
>>>
But sparse matrices are actually faster, even when we allow argsort to use a faster non-stable algorithm:
>>> m, n = 1000, 100000
>>> idx = np.random.randint(0, m, (n,))
>>> data = np.arange(n)
>>>
>>> timeit('sparse.csr_matrix((data, idx, np.arange(n+1)), (n, m)).tocsc()', **kwds)
2.250748165184632
>>> timeit('np.argsort(idx)', **kwds)
5.783584725111723
If we require argsort to keep groups sorted, the difference is even larger:
>>> timeit('np.argsort(idx, kind="mergesort")', **kwds)
10.507467685034499