NumPy version of “Exponential weighted moving average”, equivalent to pandas.ewm().mean()

后端未结

关注

 12  779

一生所求 2020-11-27 12:30

How do I get the exponential weighted moving average in NumPy just like the following in pandas?

import pandas as pd
import pandas_datareader as pdr
from dat


      
      
        
          12条回答        

        
                    
            
            
                         
                
              
              
                
                   再見小時候
                                             
                
                
                (楼主)
            
              
              
                2020-11-27 12:56
              

            
            
                        
Given alpha and windowSize, here's an approach to simulate the corresponding behavior on NumPy -

def numpy_ewm_alpha(a, alpha, windowSize):
    wghts = (1-alpha)**np.arange(windowSize)
    wghts /= wghts.sum()
    out = np.full(df.shape[0],np.nan)
    out[windowSize-1:] = np.convolve(a,wghts,'valid')
    return out


Sample runs for verification -

In [54]: alpha = 0.55
    ...: windowSize = 20
    ...: 

In [55]: df = pd.DataFrame(np.random.randint(2,9,(100)))

In [56]: out0 = df.ewm(alpha = alpha, min_periods=windowSize).mean().as_matrix().ravel()
    ...: out1 = numpy_ewm_alpha(df.values.ravel(), alpha = alpha, windowSize = windowSize)
    ...: print "Max. error : " + str(np.nanmax(np.abs(out0 - out1)))
    ...: 
Max. error : 5.10531254605e-07

In [57]: alpha = 0.75
    ...: windowSize = 30
    ...: 

In [58]: out0 = df.ewm(alpha = alpha, min_periods=windowSize).mean().as_matrix().ravel()
    ...: out1 = numpy_ewm_alpha(df.values.ravel(), alpha = alpha, windowSize = windowSize)
    ...: print "Max. error : " + str(np.nanmax(np.abs(out0 - out1)))

Max. error : 8.881784197e-16


Runtime test on bigger dataset -

In [61]: alpha = 0.55
    ...: windowSize = 20
    ...: 

In [62]: df = pd.DataFrame(np.random.randint(2,9,(10000)))

In [63]: %timeit df.ewm(alpha = alpha, min_periods=windowSize).mean()
1000 loops, best of 3: 851 µs per loop

In [64]: %timeit numpy_ewm_alpha(df.values.ravel(), alpha = alpha, windowSize = windowSize)
1000 loops, best of 3: 204 µs per loop




Further boost

For further performance boost we could avoid the initialization with NaNs and instead use the array outputted from np.convolve, like so -

def numpy_ewm_alpha_v2(a, alpha, windowSize):
    wghts = (1-alpha)**np.arange(windowSize)
    wghts /= wghts.sum()
    out = np.convolve(a,wghts)
    out[:windowSize-1] = np.nan
    return out[:a.size]  


Timings -

In [117]: alpha = 0.55
     ...: windowSize = 20
     ...: 

In [118]: df = pd.DataFrame(np.random.randint(2,9,(10000)))

In [119]: %timeit numpy_ewm_alpha(df.values.ravel(), alpha = alpha, windowSize = windowSize)
1000 loops, best of 3: 204 µs per loop

In [120]: %timeit numpy_ewm_alpha_v2(df.values.ravel(), alpha = alpha, windowSize = windowSize)
10000 loops, best of 3: 195 µs per loop

    
             
                                                        
            
            
              
                
                0
              
                   
                
               讨论(0)
              
                                                  
              
              
                          
             
       
          
              
                                       
     查看其它12个回答


            
                         
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
                              			
        
        
        
          
            
            
              
              
            
    


                                 
              
            
                          
    

        
         
                验证码
                
                  
                
                
                   看不清?
                
              
                                  
                    
   
                 
             
              提交回复