Python pandas calculate rolling stock beta using rolling apply to groupby object in vectorized fashion


一向 2020-12-17 06:45

I have a large data frame, df, containing 4 columns:

             id           period  ret_1m   mkt_ret_1m
131146       CAN00WG0     199609 -0.1538    0.0471


      
      
        
3 answers

慢半拍i (original poster) 2020-12-17 06:55

import numpy as np
import pandas as pd

def rolling_apply(df, period, func, min_periods=None):
    if min_periods is None:
        min_periods = period
    result = pd.Series(np.nan, index=df.index)

    for i in range(1, len(df)):
        sub_df = df.iloc[max(i - period, 0):i, :]  # subsample: rows strictly before row i
        if len(sub_df) >= min_periods:
            # Mind the forward-looking bias: the return at time t must not be
            # included in the beta calculated for time t, so write to row i.
            result.iloc[i] = func(sub_df)
    return result

I fixed a forward-looking bias in Happy001's code above. This is a finance problem, so it pays to be cautious.
I think vlmercado's answer is wrong. Simply applying pd.rolling_cov and pd.rolling_var to the whole frame is a mistake in a finance setting. First, the second stock, CAN00WH0, ends up with no NaN betas at the start of its history, because its first windows borrow the returns of CAN00WG0, which is plainly wrong. Second, consider a stock that was suspended for ten years: a window counted in rows still pulls the observations from before the suspension into the beta calculation.
pandas rolling also accepts time-based windows on a DatetimeIndex, but that did not seem to work with groupby, so I changed Happy001's code as follows. It is not the fastest approach, but it is at least 20x faster than the original code.
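To make the forward-looking bias concrete, here is a minimal sketch for a single stock. It assumes two aligned return Series, and the name `lagged_rolling_beta` is hypothetical. The rolling estimate is shifted by one row, so the beta stored at time t is built only from returns before t.

```python
import numpy as np
import pandas as pd

def lagged_rolling_beta(ret, mkt, window):
    """Rolling beta for one stock that excludes the current row's return."""
    cov = ret.rolling(window).cov(mkt)  # rolling cov(ret, mkt)
    var = mkt.rolling(window).var()     # rolling var(mkt)
    # shift(1) moves each estimate one row forward, so row t only sees rows < t
    return (cov / var).shift(1)

# Illustrative data: the first three rows satisfy ret = 2 * mkt exactly.
mkt = pd.Series([0.01, 0.02, -0.01, 0.03, 0.00, 0.02])
ret = pd.Series([0.02, 0.04, -0.02, 0.50, -0.30, 0.10])
beta = lagged_rolling_beta(ret, mkt, window=3)
```

With a window of 3, the first three rows are NaN and the fourth row holds the beta of the first three observations, regardless of the fourth row's own return.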
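The first point can be illustrated with a rough sketch that computes the rolling statistics one stock at a time, so that no window ever spans two ids. It assumes the column names from the question (`id`, `period`, `ret_1m`, `mkt_ret_1m`) and a unique row index; the function name `rolling_beta_per_stock` is hypothetical. It does not address the second point (suspensions) or the one-period lag.

```python
import numpy as np
import pandas as pd

def rolling_beta_per_stock(df, window, min_periods=None):
    """Rolling beta computed separately for each id; assumes df.index is unique."""
    pieces = []
    for _, g in df.groupby("id", sort=False):
        g = g.sort_values("period")  # windows must follow time order within a stock
        cov = g["ret_1m"].rolling(window, min_periods=min_periods).cov(g["mkt_ret_1m"])
        var = g["mkt_ret_1m"].rolling(window, min_periods=min_periods).var()
        pieces.append(cov / var)
    # restore the original row order of df
    return pd.concat(pieces).reindex(df.index).rename("beta")
```

Because each stock gets its own windows, every stock starts with NaN betas until it has accumulated enough of its own observations.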
import pandas as pd

crsp_daily['date'] = pd.to_datetime(crsp_daily['date'])
crsp_daily = crsp_daily.set_index('date')  # time-based rolling needs a DatetimeIndex
calc = crsp_daily[['permno', 'ret', 'mkt_ret']]
pieces = []
for stock, sub_df in calc.groupby('permno'):  # rolling beta for each stock separately
    sub2_df = sub_df[['ret', 'mkt_ret']].sort_index()
    # 5-year rolling covariance matrix; 'D' means days, and only fixed-length
    # units such as seconds or days are accepted, not w/m/y
    cov_m = sub2_df.rolling('1825D', min_periods=150).cov()
    # on the 'mkt_ret' rows, 'ret' holds cov(ret, mkt_ret) and 'mkt_ret' holds var(mkt_ret)
    cov_m = cov_m.xs('mkt_ret', level=1, axis=0)
    stock_beta = (cov_m['ret'] / cov_m['mkt_ret']).rename('beta').to_frame()
    pieces.append(stock_beta.assign(permno=stock))
beta = pd.concat(pieces).reset_index()
beta = beta[['date', 'permno', 'beta']]
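
The suspension problem is easy to see on a toy series. The sketch below uses made-up dates with a long gap and counts how many observations fall into each window, once with a window of 3 rows and once with a 30-day time window; all names and values are illustrative.

```python
import pandas as pd

# Three observations, then a gap of well over a year, then two more.
dates = pd.to_datetime(
    ["2020-01-01", "2020-01-02", "2020-01-03", "2021-06-01", "2021-06-02"]
)
s = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0], index=dates)

# Row-count window: always looks back 3 rows, however old they are.
by_rows = s.rolling(3, min_periods=1).count()

# Time-based window: only rows within the last 30 calendar days qualify.
by_time = s.rolling("30D", min_periods=1).count()
```

After the gap, `by_rows` still counts 3 observations per window, while `by_time` restarts at 1, which is why a time-based window combined with `min_periods` keeps stale pre-suspension data out of the estimate.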

    
             
                                                        
            
            
              
                