Rolling Correlation with Groupby in Pandas

巧了我就是萌 submitted on 2019-12-11 02:16:05

Question


Assuming I have a Pandas dataframe similar to the one below, how would I get the rolling correlation (over 2 days in this example) between two specific columns, computed separately for each value of the 'ID' column? I am familiar with the Pandas rolling_corr() function, but I cannot figure out how to combine it with groupby().

What I have:

ID  Date    Val1    Val2
A   1-Jan   45      22
A   2-Jan   15      66
A   3-Jan   55      13
B   1-Jan   41      12
B   2-Jan   87      45
B   3-Jan   82      66
C   1-Jan   33      34
C   2-Jan   15      67
C   3-Jan   46      22

What I need:

ID  Date    Val1    Val2    Rolling_Corr
A   1-Jan   45      22  
A   2-Jan   15      66      0.1
A   3-Jan   55      13      0.16
B   1-Jan   41      12  
B   2-Jan   87      45      0.15
B   3-Jan   82      66      0.05
C   1-Jan   33      34  
C   2-Jan   15      67      0.09
C   3-Jan   46      22      0.11

Thanks!


Answer 1:


You can start with the simple approach from the related question "Pandas Correlation Groupby"

and then add rolling(3) like this:

df.groupby('ID')[['Val1','Val2']].rolling(3).corr()

I've changed the window from 2 to 3 because a window of 2 can only ever give 1 or -1: any two points lie exactly on a straight line. Unfortunately, that output (not shown) is a bit verbose, because it contains a full 2x2 correlation matrix for each row when all you need is a scalar. But with an additional line you can make the output more concise:

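# Rolling 3-row correlation matrices per ID (one 2x2 block per original row)
df2 = df.groupby('ID')[['Val1','Val2']].rolling(3).corr()

# Keep the last row of each block (the 'Val2' row) and read its 'Val1' entry
df2.groupby(level=[0,1]).last()['Val1']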

ID   
A   0         NaN
    1         NaN
    2   -0.996539
B   3         NaN
    4         NaN
    5    0.879868
C   6         NaN
    7         NaN
    8   -0.985529
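
If the goal is the Rolling_Corr column from the question rather than a separate Series, one possible variation is to compute the pairwise correlation group by group with Series.rolling(...).corr(other) and assign the result back. This is only a sketch under assumptions: the helper name rolling_corr_by_group is hypothetical (not part of pandas), the DataFrame is rebuilt by hand from the sample in the question, and a default integer index is assumed so that the pieces align on assignment.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "ID": list("AAABBBCCC"),
    "Date": ["1-Jan", "2-Jan", "3-Jan"] * 3,
    "Val1": [45, 15, 55, 41, 87, 82, 33, 15, 46],
    "Val2": [22, 66, 13, 12, 45, 66, 34, 67, 22],
})


def rolling_corr_by_group(frame, key, col_a, col_b, window):
    """Hypothetical helper: rolling corr of col_a vs col_b within each group."""
    parts = [
        g[col_a].rolling(window).corr(g[col_b])
        for _, g in frame.groupby(key)
    ]
    # Each part keeps its original row labels, so the result aligns with frame.
    return pd.concat(parts)


# Window of 3, as in the answer; the first two rows of each ID stay NaN.
df["Rolling_Corr"] = rolling_corr_by_group(df, "ID", "Val1", "Val2", window=3)
print(df)

# With a window of 2, every defined value is +1 or -1 (up to rounding).
corr2 = rolling_corr_by_group(df, "ID", "Val1", "Val2", window=2)
print(corr2.dropna().round(6).tolist())
```

Because each window is taken inside a single group, values never mix rows from different IDs. The rows with fewer than `window` observations in their group are left as NaN, which corresponds to the blank cells in the desired output.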


Source: https://stackoverflow.com/questions/28998998/rolling-correlation-with-groupby-in-pandas
