Fastest way to sort a large number of arrays in Python

难免孤独 2021-01-21 14:46

I am trying to sort a large number of arrays in Python. I need to sort more than 11 million arrays at once.

Also, it would be nice if I could directly get

3 Answers

囚心锁ツ (OP)

2021-01-21 15:09
Well, for cases like this, where you are only interested in the indices of the first few elements in sorted order, there's NumPy's np.argpartition.

The troublesome np.argsort call is in w[np.argsort(z)[::-1]][:7], which is essentially w[idx], where idx = np.argsort(z)[::-1][:7].

So idx could be calculated with np.argpartition, like so:
```
idx = np.argpartition(-z,np.arange(7))[:7]
```
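The -z is needed because np.argpartition works in ascending order: with kth = np.arange(7), it puts the indices of the 7 smallest elements into their final sorted positions at the front and leaves the rest unordered. Negating the elements turns "7 smallest of -z" into "7 largest of z", in descending order.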
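To make the effect of the negation and of the range-valued kth concrete, here is a small worked sketch. The array `z` and the choice `k = 3` are made-up sample data, not taken from the question, and the values are distinct so that both approaches must agree on the indices.

```python
import numpy as np

# Hypothetical sample data with distinct values (not from the original post).
z = np.array([12, 5, 40, 7, 33, 21, 8, 19, 2, 27])
k = 3

# Full sort, then reverse and truncate.
idx_sort = np.argsort(z)[::-1][:k]

# Partial sort: a range as kth puts positions 0..k-1 of -z
# into their final sorted order; the remaining positions are unordered.
idx_part = np.argpartition(-z, np.arange(k))[:k]

top_values = z[idx_part]

print(idx_sort)    # [2 4 9]
print(idx_part)    # [2 4 9]
print(top_values)  # [40 33 27]
```

Note that negating assumes a signed integer or floating-point dtype; with an unsigned dtype, -z wraps around and the trick no longer gives the largest elements.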

Thus, the proposed change to the original code would be:
```
func = w[np.argpartition(-z,np.arange(7))[:7]]
```
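Since the question is about millions of arrays, the same idea might also be applied to all of them in one call, if they can be stacked into a 2-D array. The sketch below assumes that layout; the names `Z`, `W`, and `top_w` and the shapes are hypothetical, with 1000 rows standing in for the real data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical shapes: one row per array to be sorted.
Z = rng.random((1000, 50))
W = rng.random((1000, 50))
k = 7

# Indices of the k largest entries in each row of Z, in descending order.
idx = np.argpartition(-Z, np.arange(k), axis=1)[:, :k]

# Select the matching entries of W, row by row.
top_w = np.take_along_axis(W, idx, axis=1)

print(idx.shape, top_w.shape)  # (1000, 7) (1000, 7)
```

Whether this beats a Python loop over individual arrays depends on the array sizes and available memory, so it is worth timing on the real data.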
Runtime test:
```
In [162]: z = np.random.randint(0,10000000,(1100000)) # Random int array

In [163]: idx1 = np.argsort(z)[::-1][:7]
     ...: idx2 = np.argpartition(-z,np.arange(7))[:7]
     ...: 

In [164]: np.allclose(idx1,idx2) # Verify results
Out[164]: True

In [165]: %timeit np.argsort(z)[::-1][:7]
1 loops, best of 3: 264 ms per loop

In [166]: %timeit np.argpartition(-z,np.arange(7))[:7]
10 loops, best of 3: 36.5 ms per loop
```
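The comparison above can be rerun outside IPython with a short script along these lines. This is only a sketch: the helper names `top7_argsort` and `top7_argpartition` are made up here, and the timings will differ by machine and NumPy version. It compares the selected values rather than the indices, because with repeated values the two methods may order tied elements differently.

```python
import timeit

import numpy as np

rng = np.random.default_rng(0)
z = rng.integers(0, 10_000_000, size=1_100_000)  # random int array


def top7_argsort(z):
    # Full sort, reversed, first 7 indices.
    return np.argsort(z)[::-1][:7]


def top7_argpartition(z):
    # Partial sort of the negated array, first 7 indices.
    return np.argpartition(-z, np.arange(7))[:7]


# Values must match even if tied elements yield different indices.
same_values = np.array_equal(z[top7_argsort(z)], z[top7_argpartition(z)])

# Best of 3 single runs for each approach.
t_sort = min(timeit.repeat(lambda: top7_argsort(z), number=1, repeat=3))
t_part = min(timeit.repeat(lambda: top7_argpartition(z), number=1, repeat=3))

print(f"same values: {same_values}")
print(f"argsort: {t_sort * 1e3:.1f} ms, argpartition: {t_part * 1e3:.1f} ms")
```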