NumPy version of “Exponential weighted moving average”, equivalent to pandas.ewm().mean()

一生所求 2020-11-27 12:30

How do I get the exponential weighted moving average in NumPy just like the following in pandas?

import pandas as pd
import pandas_datareader as pdr
from dat         


        
12 Answers
  •  迷失自我
    2020-11-27 12:45

    I think I have finally cracked it!

    Here's a vectorized version of the numpy_ewma function from @RaduS's post, which is claimed to produce the correct results:

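    import numpy as np

    def numpy_ewma_vectorized(data, window):

        # Smoothing factor for a span of `window` samples
        alpha = 2 / (window + 1.0)
        alpha_rev = 1 - alpha

        scale = 1 / alpha_rev
        n = data.shape[0]

        r = np.arange(n)
        scale_arr = scale**r                     # (1 - alpha)**(-r)
        offset = data[0] * alpha_rev**(r + 1)    # decayed contribution of the initial value
        pw0 = alpha * alpha_rev**(n - 1)

        # Running weighted sum, rescaled so that output i uses weights alpha*(1 - alpha)**(i - j)
        mult = data * pw0 * scale_arr
        cumsums = mult.cumsum()
        out = offset + cumsums * scale_arr[::-1]
        return out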
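
For reference, the recursion that the vectorized function reproduces can be sketched as a plain loop. The loop-based numpy_ewma from @RaduS's post is not shown here, so this is only an assumption about what it computes: y[0] = x[0] and y[i] = alpha*x[i] + (1 - alpha)*y[i-1]. The names `ewma_loop_reference` and `ewma_closed_form` are hypothetical and exist only for this check.

```python
import numpy as np

def ewma_loop_reference(data, window):
    """Hypothetical loop reference: y[0] = x[0], y[i] = alpha*x[i] + (1 - alpha)*y[i-1]."""
    data = np.asarray(data, dtype=np.float64)
    alpha = 2 / (window + 1.0)
    out = np.empty_like(data)
    out[0] = data[0]
    for i in range(1, data.shape[0]):
        out[i] = alpha * data[i] + (1 - alpha) * out[i - 1]
    return out

def ewma_closed_form(data, window):
    """Same quantity as an explicit weighted sum (O(n^2), for checking only)."""
    data = np.asarray(data, dtype=np.float64)
    alpha = 2 / (window + 1.0)
    n = data.shape[0]
    out = np.empty(n)
    for i in range(n):
        # Weights alpha*(1 - alpha)**(i - j) for j = 0..i, plus the decayed initial value
        w = alpha * (1 - alpha) ** (i - np.arange(i + 1))
        out[i] = data[0] * (1 - alpha) ** (i + 1) + np.dot(w, data[: i + 1])
    return out

# Small worked example: window=3 gives alpha=0.5, so the values are 1.0, 1.5, 2.25
sample = np.array([1.0, 2.0, 3.0])
print(ewma_loop_reference(sample, 3))
```

The weighted-sum form is the expression that the cumulative-sum trick evaluates in a single pass: the cumulative sum accumulates the terms with a common scaling, and the reversed scale array restores the per-position weights.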
    

    Further boost

    We can speed it up further by computing the powers once and reusing them:

    def numpy_ewma_vectorized_v2(data, window):

        alpha = 2 / (window + 1.0)
        alpha_rev = 1 - alpha
        n = data.shape[0]

        # Powers (1 - alpha)**0 .. (1 - alpha)**n, computed once and reused below
        pows = alpha_rev**(np.arange(n + 1))

        scale_arr = 1 / pows[:-1]      # (1 - alpha)**(-r)
        offset = data[0] * pows[1:]    # decayed contribution of the initial value
        pw0 = alpha * alpha_rev**(n - 1)

        mult = data * pw0 * scale_arr
        cumsums = mult.cumsum()
        out = offset + cumsums * scale_arr[::-1]
        return out
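
One caveat worth checking before running either version on long inputs: scale_arr holds (1 - alpha)**(-r), which grows geometrically and eventually exceeds the float64 range. The sketch below estimates roughly where that happens. `max_safe_length` is a hypothetical helper, not part of the original answer, and the estimate is approximate.

```python
import numpy as np

def max_safe_length(window):
    """Hypothetical helper: approximate largest exponent r for which
    (1 - alpha)**(-r) still fits in float64."""
    alpha_rev = 1 - 2 / (window + 1.0)
    log_max = np.log(np.finfo(np.float64).max)
    return int(log_max / -np.log(alpha_rev))

window = 20
limit = max_safe_length(window)
scale = np.float64(1 / (1 - 2 / (window + 1.0)))

with np.errstate(over="ignore"):
    below = scale ** np.float64(limit - 1)    # still finite
    above = scale ** np.float64(limit + 10)   # overflows to inf

print(limit, np.isfinite(below), np.isfinite(above))
```

With window = 20 the estimate lands a little above 7000 elements, so a 5000-element input stays within range. For longer series, processing the data in chunks or falling back to the loop may be needed.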
    

    Runtime test

    Let's time these two against the original loop-based numpy_ewma function on a larger dataset.

    In [97]: data = np.random.randint(2,9,(5000))
        ...: window = 20
        ...:
    
    In [98]: np.allclose(numpy_ewma(data, window), numpy_ewma_vectorized(data, window))
    Out[98]: True
    
    In [99]: np.allclose(numpy_ewma(data, window), numpy_ewma_vectorized_v2(data, window))
    Out[99]: True
    
    In [100]: %timeit numpy_ewma(data, window)
    100 loops, best of 3: 6.03 ms per loop
    
    In [101]: %timeit numpy_ewma_vectorized(data, window)
    1000 loops, best of 3: 665 µs per loop
    
    In [102]: %timeit numpy_ewma_vectorized_v2(data, window)
    1000 loops, best of 3: 357 µs per loop
    
    In [103]: 6030/357.0
    Out[103]: 16.89075630252101
    

    That is roughly a 17x speedup over the loop-based version!
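
Since the question asks for parity with pandas, a cross-check along these lines may be useful. The question's own pandas call is truncated, so the target used here, `Series.ewm(span=window, adjust=False).mean()`, is an assumption. `ewma_vectorized` and `matches_pandas` are hypothetical names; the first is just a compact restatement of the vectorized approach.

```python
import numpy as np

def ewma_vectorized(data, window):
    """Compact restatement of the vectorized approach (hypothetical name)."""
    data = np.asarray(data, dtype=np.float64)
    alpha = 2 / (window + 1.0)
    alpha_rev = 1 - alpha
    n = data.shape[0]
    pows = alpha_rev ** np.arange(n + 1)
    scale_arr = 1 / pows[:-1]
    offset = data[0] * pows[1:]
    pw0 = alpha * alpha_rev ** (n - 1)
    return offset + (data * pw0 * scale_arr).cumsum() * scale_arr[::-1]

def matches_pandas(data, window):
    """Compare against pandas with adjust=False (assumed to be the intended setting)."""
    import pandas as pd  # imported lazily so the NumPy part runs without pandas
    expected = pd.Series(data).ewm(span=window, adjust=False).mean().to_numpy()
    return np.allclose(ewma_vectorized(data, window), expected)
```

Calling `matches_pandas(np.random.randint(2, 9, 5000), 20)` should return True if the assumption about `adjust=False` holds. With the pandas default `adjust=True`, the early values are weighted differently and the two results are not expected to match.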
