I want to create an array which holds the max() of a window moving through a given numpy array.
I have tried several variants and would declare the Pandas version the winner of this performance race. Among the variants was a binary tree (implemented in pure Python) for quickly computing the max of arbitrary subranges (source available on demand). The best algorithm I came up with myself was a plain rolling window using a ring buffer; its max only needed to be recomputed completely when the current max value dropped out of the window in that iteration; otherwise it stayed the same or increased to the next new value. Against the old library versions, this pure-Python implementation was faster than the rest.
In the end I found that the versions of the libraries in question were highly relevant. The rather old versions I had still been using were way slower than the modern ones. Here are the numbers for rolling-maxing 1M numbers with a window of size 100k:
         old (slow HW)              new (better HW)
scipy:   0.9.0: 21.2987391949       0.13.3: 11.5804400444
pandas:  0.7.0: 13.5896410942       0.18.1:  0.0551438331604
numpy:   1.6.1:  1.17417216301      1.8.2:   0.537392139435
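For comparison, this is roughly the pandas call behind the fast timing above; a minimal sketch, assuming a pandas version with the .rolling() API (0.18 or newer), with the leading NaN entries sliced off so there is one value per full window:

import numpy as np
import pandas as pd

a = np.random.randn(1000000)
window = 100000

# rolling(...).max() produces NaN for the first window-1 positions;
# dropping them leaves len(a) - window + 1 values, one per full window
pandasMax = pd.Series(a).rolling(window).max().values[window - 1:]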
Here is the implementation of the pure numpy version using a ring buffer:
import numpy as np

def rollingMax(a, window):
    def eachValue():
        # start with the first full window and its max
        w = a[:window].copy()
        m = w.max()
        yield m
        i = 0
        j = window
        while j < len(a):
            oldValue = w[i]
            newValue = w[i] = a[j]  # overwrite the oldest value in the ring buffer
            if newValue > m:
                m = newValue        # the new value raises the max
            elif oldValue == m:
                m = w.max()         # the max just dropped out: recompute from scratch
            yield m
            i = (i + 1) % window
            j += 1
    return np.array(list(eachValue()))
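A quick sanity check of this version against a brute-force computation (small sizes only; the array and window here are just for illustration):

import numpy as np

a = np.random.randn(1000)
window = 100

# brute force: one explicit max() per window position
expected = np.array([a[k:k + window].max() for k in range(len(a) - window + 1)])
assert np.array_equal(rollingMax(a, window), expected)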
For my input this works great because I'm handling audio data with lots of peaks in all directions. If you feed it a constantly decreasing signal (e.g. -np.arange(10000000)), you will hit the worst case: the current max drops out of the window on every iteration, so the full window max has to be recomputed every time (and maybe you should reverse the input and the output in such cases).
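A sketch of that reversal trick (the wrapper name is made up for illustration): each window covers the same set of values whether the array is scanned forwards or backwards, so reversing both input and output gives the same result while turning a monotonically decreasing signal (the worst case) into a monotonically increasing one (the best case):

def rollingMaxReversed(a, window):
    # same result as rollingMax(a, window), but a decreasing input is
    # processed as an increasing one, so the elif branch (full w.max()
    # recomputation) is never hit
    return rollingMax(a[::-1], window)[::-1]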
I just include this in case someone wants to do this task on a machine with old libraries.