Pandas Rolling Apply custom

野的像风 2020-12-15 06:39

I have been following a similar answer here, but I have some questions when using sklearn and rolling apply. I am trying to create z-scores and do PCA with rolling apply, bu

2 Answers
  • 2020-12-15 07:24

    As @BrenBarn commented, the function passed to a rolling apply needs to reduce a vector to a single number. The following is equivalent to what you were trying to do and helps highlight the problem.

    zscore = lambda x: (x - x.mean()) / x.std()
    tmp.rolling(5).apply(zscore)
    
    TypeError: only length-1 arrays can be converted to Python scalars
    

    In the zscore function, x.mean() reduces, x.std() reduces, but x is an array. Thus the entire thing is an array.
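
    For contrast, here is a minimal sketch of a function that does reduce each window to one number, so rolling apply accepts it. The Series `s` and its values are made up for illustration, and `raw=True` is used so each window arrives as a NumPy array.

```python
import numpy as np
import pandas as pd

# Made-up data, only for illustration.
s = pd.Series([1.0, 2.0, 4.0, 7.0, 11.0, 16.0])

# The range (max - min) of each window is a single scalar,
# which is what rolling apply expects back from the function.
reduced = s.rolling(3).apply(lambda x: x.max() - x.min(), raw=True)

print(reduced.tolist())  # [nan, nan, 3.0, 5.0, 7.0, 9.0]
```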


    The way around this is to perform the roll on the parts of the z-score calculation that require it, and not on the parts that cause the problem.

    (tmp - tmp.rolling(5).mean()) / tmp.rolling(5).std()
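
    The expression above can be tried end to end as follows. This is only a sketch: the question's `tmp` is not shown, so a random DataFrame with hypothetical column names stands in for it.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the question's `tmp`.
rng = np.random.default_rng(0)
tmp = pd.DataFrame(rng.normal(size=(10, 3)), columns=["a", "b", "c"])

window = 5
roll = tmp.rolling(window)

# Each row is standardized against the mean and std of the window ending at that row.
rolling_z = (tmp - roll.mean()) / roll.std()

# Cross-check the last row by hand against the final 5 rows.
last_window = tmp.iloc[-window:]
by_hand = (tmp.iloc[-1] - last_window.mean()) / last_window.std()

print(rolling_z.tail(3))
print(np.allclose(rolling_z.iloc[-1], by_hand))
```

    The first `window - 1` rows come out as NaN because their windows are incomplete.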
    

  • 2020-12-15 07:37

    Since x in the lambda function represents the current rolling window (a Series or ndarray), the lambda can be written to return a single value, where x[-1] refers to the last data point in the window, i.e. the current row:

    zscore = lambda x: (x[-1] - x.mean()) / x.std(ddof=1)
    

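    Then it is OK to call the following. In recent pandas versions, pass raw=True so that each window arrives as a NumPy array and x[-1] indexes by position:

    tmp.rolling(5).apply(zscore, raw=True)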
    

    Also note that the delta degrees of freedom (ddof) defaults to 1 in tmp.rolling(5).std(). To generate the same results as @piRSquared's answer, one has to pass ddof=1 to x.std(), because NumPy's std defaults to ddof=0. It took quite a while to figure this out!
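
    A quick way to check the ddof point is to compare both approaches on the same data. This is a sketch under assumptions: `tmp` is a random stand-in for the question's DataFrame, `zscore_last` is a hypothetical helper name, and `raw=True` is passed so that each window is a NumPy array.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the question's `tmp`.
rng = np.random.default_rng(0)
tmp = pd.DataFrame(rng.normal(size=(10, 3)), columns=["a", "b", "c"])

def zscore_last(x, ddof):
    # x is one window as an ndarray; x[-1] is the current (last) point.
    return (x[-1] - x.mean()) / x.std(ddof=ddof)

vectorized = (tmp - tmp.rolling(5).mean()) / tmp.rolling(5).std()
applied_ddof1 = tmp.rolling(5).apply(lambda x: zscore_last(x, 1), raw=True)
applied_ddof0 = tmp.rolling(5).apply(lambda x: zscore_last(x, 0), raw=True)

same_with_ddof1 = np.allclose(vectorized.to_numpy(), applied_ddof1.to_numpy(), equal_nan=True)
same_with_ddof0 = np.allclose(vectorized.to_numpy(), applied_ddof0.to_numpy(), equal_nan=True)

print(same_with_ddof1)  # True: matches the vectorized version
print(same_with_ddof0)  # False: off by a factor of sqrt(5/4)
```

    The vectorized form avoids calling a Python function once per window, so it is usually the faster of the two on large frames.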
