Calculate rolling correlation with pandas

死守一世寂寞 2021-01-03 16:46

I have a list of 10 stocks differentiated by PERMNO. I would like to group those stocks by PERMNO and calculate the rolling correlation between the stock return (RET) for ea

3 Answers
  • 2021-01-03 16:56

    Running pd.rolling_corr() (here on Python 3.5) generates a warning that the function is deprecated and may stop working in a future release; it has since been removed from pandas. The recommended replacement is Series.rolling(window=<period>).corr(other=series). For example:

    data['scrip1DailyReturn'].rolling(window=90).corr(other=data['scrip2DailyReturn'])
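
    For a self-contained check of how this call behaves, here is a minimal sketch. The return data is randomly generated purely for illustration (only the two column names come from the line above), and the window of 90 is the one used there. It also shows that the first window - 1 rows come back as NaN, because the window is not yet full.

```python
import numpy as np
import pandas as pd

# Hypothetical daily returns for two instruments; the values are random
# and exist only to make the example runnable.
rng = np.random.default_rng(0)
data = pd.DataFrame({
    "scrip1DailyReturn": rng.normal(0.0, 0.01, 250),
    "scrip2DailyReturn": rng.normal(0.0, 0.01, 250),
})

window = 90
corr = data["scrip1DailyReturn"].rolling(window=window).corr(
    other=data["scrip2DailyReturn"]
)

# Rows before the first full window are NaN; later rows hold the
# Pearson correlation of the most recent `window` observations.
print(corr.iloc[window - 3 : window + 2])
```

    Passing min_periods to rolling() is one way to get values before the window is full, at the cost of noisier early estimates.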
    
  • 2021-01-03 17:09

    I found an efficient solution. Fairly simple.

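    def roll_corr_groupby(group, window):
        # Rolling correlation of 'col 1' with 'col 2' inside a single group.
        group = group.copy()
        group['Z'] = group['col 1'].rolling(window).corr(group['col 2'])
        return group
    
    # apply() forwards extra keyword arguments (here the window size) to the function.
    x = x.groupby('key', group_keys=False).apply(roll_corr_groupby, window=3)
    x.head()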
    
  • 2021-01-03 17:11

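    rolling_corr was a top-level function (pandas.rolling_corr), not a DataFrame method. It has since been removed from pandas; the equivalent call is Series.rolling(window).corr(other), which the code below uses. Also note that groupby does not return a DataFrame: it returns a GroupBy object that yields (key, group) pairs when you iterate over it.

    Code:

    import pandas as pd
    
    df = pd.read_csv("color.csv")
    df_gen = df.copy().groupby("Color")
    
    # Each iteration yields the group key and the rows belonging to that group.
    for key, value in df_gen:
        print("key: {}".format(key))
        print(value["Value1"].rolling(3).corr(value["Value2"]))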
    

    Output:

    key: Blue
    1          NaN
    3          NaN
    6     0.931673
    8     0.865066
    10    0.089304
    12   -0.998656
    15   -0.971373
    17   -0.667316
    dtype: float64
    key: Red
    0          NaN
    2          NaN
    5    -0.911357
    9    -0.152221
    11   -0.971153
    14    0.438697
    18   -0.550727
    dtype: float64
    key: Yellow
    4          NaN
    7          NaN
    13   -0.040330
    16    0.879371
    dtype: float64
    

    You can change the loop part to the following to view the original dataframe post-grouping with a new column as well.

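    for key, value in df_gen:
        value = value.copy()  # work on a copy so the grouped data is left untouched
        value["ROLL_CORR"] = value["Value1"].rolling(3).corr(value["Value2"])
        print("key: {}".format(key))
        print(value)
    

    Output:

    key: Blue
       Color    Value1    Value2  ROLL_CORR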
    1   Blue  0.951227  0.514999        NaN
    3   Blue  0.649112  0.513052        NaN
    6   Blue  0.148165  0.342205   0.931673
    8   Blue  0.626883  0.421530   0.865066
    10  Blue  0.286738  0.583811   0.089304
    12  Blue  0.966779  0.227340  -0.998656
    15  Blue  0.065493  0.887640  -0.971373
    17  Blue  0.757932  0.900103  -0.667316
    key: Red
       Color    Value1    Value2  ROLL_CORR
    0    Red  0.201435  0.981871        NaN
    2    Red  0.522955  0.357239        NaN
    5    Red  0.806326  0.310039  -0.911357
    9    Red  0.656126  0.678047  -0.152221
    11   Red  0.435898  0.908388  -0.971153
    14   Red  0.116419  0.555821   0.438697
    18   Red  0.793102  0.168033  -0.550727
    key: Yellow
         Color    Value1    Value2  ROLL_CORR
    4   Yellow  0.099474  0.143293        NaN
    7   Yellow  0.073128  0.749297        NaN
    13  Yellow  0.006777  0.318383  -0.040330
    16  Yellow  0.345647  0.993382   0.879371
    

    To join the groups back into a single DataFrame afterwards, collect them in a list and call pd.concat once the loop is done. Note that the rows then come out ordered by group rather than in their original order, which may confuse readers.

    import pandas as pd
    
    df = pd.read_csv("color.csv")
    df_gen = df.copy().groupby("Color")
    
    dfs = [] # Container for dataframes.
    
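    for key, value in df_gen:
        value = value.copy()  # work on a copy so the grouped data is left untouched
        value["ROLL_CORR"] = value["Value1"].rolling(3).corr(value["Value2"])
        dfs.append(value)
    
    # Stack the per-group frames; rows end up ordered by group.
    df_final = pd.concat(dfs)
    print(df_final)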
    

    Output:

         Color    Value1    Value2  ROLL_CORR
    1     Blue  0.951227  0.514999        NaN
    3     Blue  0.649112  0.513052        NaN
    6     Blue  0.148165  0.342205   0.931673
    8     Blue  0.626883  0.421530   0.865066
    10    Blue  0.286738  0.583811   0.089304
    12    Blue  0.966779  0.227340  -0.998656
    15    Blue  0.065493  0.887640  -0.971373
    17    Blue  0.757932  0.900103  -0.667316
    0      Red  0.201435  0.981871        NaN
    2      Red  0.522955  0.357239        NaN
    5      Red  0.806326  0.310039  -0.911357
    9      Red  0.656126  0.678047  -0.152221
    11     Red  0.435898  0.908388  -0.971153
    14     Red  0.116419  0.555821   0.438697
    18     Red  0.793102  0.168033  -0.550727
    4   Yellow  0.099474  0.143293        NaN
    7   Yellow  0.073128  0.749297        NaN
    13  Yellow  0.006777  0.318383  -0.040330
    16  Yellow  0.345647  0.993382   0.879371
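
    If the original row order matters, one alternative is to compute only the correlation Series for each group and assign the combined result back as a column, letting pandas align it on the index. This is a minimal sketch, not the only way to do it. Since color.csv itself is not shown, the rows below are transcribed from the Color, Value1 and Value2 columns printed above.

```python
import pandas as pd

# Rows transcribed from the printed output above, in index order 0..18.
rows = [
    ("Red", 0.201435, 0.981871), ("Blue", 0.951227, 0.514999),
    ("Red", 0.522955, 0.357239), ("Blue", 0.649112, 0.513052),
    ("Yellow", 0.099474, 0.143293), ("Red", 0.806326, 0.310039),
    ("Blue", 0.148165, 0.342205), ("Yellow", 0.073128, 0.749297),
    ("Blue", 0.626883, 0.421530), ("Red", 0.656126, 0.678047),
    ("Blue", 0.286738, 0.583811), ("Red", 0.435898, 0.908388),
    ("Blue", 0.966779, 0.227340), ("Yellow", 0.006777, 0.318383),
    ("Red", 0.116419, 0.555821), ("Blue", 0.065493, 0.887640),
    ("Yellow", 0.345647, 0.993382), ("Blue", 0.757932, 0.900103),
    ("Red", 0.793102, 0.168033),
]
df = pd.DataFrame(rows, columns=["Color", "Value1", "Value2"])

# One correlation Series per group; each keeps its original row labels.
parts = [
    group["Value1"].rolling(3).corr(group["Value2"])
    for _, group in df.groupby("Color")
]

# Column assignment aligns on the index, so df stays in its original order.
df["ROLL_CORR"] = pd.concat(parts)
print(df)
```

    The per-group values are the same as in the concatenated output above; only the row order differs.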
    

    Hope this helps.
