Apply a rolling function by group in R (zoo, data.table)

六月ゝ 毕业季﹏ posted on 2021-02-05 11:07:19

Question


I am having trouble doing something fairly simple: applying a rolling function (standard deviation) by group in a data.table. When I call rollapply inside a data.table grouped by some column, each group gets back fewer values than it has rows, so data.table recycles the result, as noted in the warning messages below. Instead of recycled standard deviations, I would like NA for the observations that do not have a complete window.

This is my approach so far using iris, and a rolling window of size 2, aligned to the right:

library(zoo)
library(data.table)

A <- iris
setDT(A)
A[, stdev := rollapply(Petal.Width, width = 2, FUN = sd, align = 'right', partial = FALSE), by = Species]
Warning messages:
1: In `[.data.table`(A, , `:=`(stdeev, rollapply(Petal.Width, width = 2,  :
  Supplied 49 items to be assigned to group 1 of size 50 in column 'stdeev' (recycled leaving remainder of 1 items).
2: In `[.data.table`(A, , `:=`(stdeev, rollapply(Petal.Width, width = 2,  :
  Supplied 49 items to be assigned to group 2 of size 50 in column 'stdeev' (recycled leaving remainder of 1 items).
3: In `[.data.table`(A, , `:=`(stdeev, rollapply(Petal.Width, width = 2,  :
  Supplied 49 items to be assigned to group 3 of size 50 in column 'stdeev' (recycled leaving remainder of 1 items).

> A
     Sepal.Length Sepal.Width Petal.Length Petal.Width   Species     stdeev      stdev
  1:          5.1         3.5          1.4         0.2    setosa 0.00000000 0.00000000
  2:          4.9         3.0          1.4         0.2    setosa 0.00000000 0.00000000
  3:          4.7         3.2          1.3         0.2    setosa 0.00000000 0.00000000
  4:          4.6         3.1          1.5         0.2    setosa 0.00000000 0.00000000
  5:          5.0         3.6          1.4         0.2    setosa 0.14142136 0.14142136
 ---                                                                                  
146:          6.7         3.0          5.2         2.3 virginica 0.28284271 0.28284271
147:          6.3         2.5          5.0         1.9 virginica 0.07071068 0.07071068
148:          6.5         3.0          5.2         2.0 virginica 0.21213203 0.21213203
149:          6.2         3.4          5.4         2.3 virginica 0.35355339 0.35355339
150:          5.9         3.0          5.1         1.8 virginica 0.42426407 0.42426407

Answer 1:


Add fill = NA to the rollapply call. With a window of width 2, rollapply then returns a vector of length 50 for each group (rather than 49), with NA as the first value (since align = "right"), so the result matches the group size and nothing is recycled.
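
As a quick check of the lengths involved, here is a minimal sketch on a single group. The variable names x, no_fill and with_fill are only illustrative and are not part of the original answer.

```r
library(zoo)

# Petal widths for one species: a plain numeric vector of 50 values
x <- iris$Petal.Width[iris$Species == "setosa"]

# Without fill, only complete windows are returned: n - width + 1 values
no_fill <- rollapply(x, width = 2, FUN = sd, align = "right")
length(no_fill)    # 49

# With fill = NA, the incomplete leading window is padded with NA
with_fill <- rollapply(x, width = 2, FUN = sd, align = "right", fill = NA)
length(with_fill)  # 50
head(with_fill, 3) # NA 0 0
```

The padded vector has one value per row of the group, which is what data.table needs for := to assign by group without recycling.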

A[, stdev := rollapply(Petal.Width, width = 2, FUN = sd, align = 'right', partial = FALSE, fill = NA), by = Species]
    Sepal.Length Sepal.Width Petal.Length Petal.Width    Species      stdev
1            5.1         3.5          1.4         0.2     setosa         NA
2            4.9         3.0          1.4         0.2     setosa 0.00000000
3            4.7         3.2          1.3         0.2     setosa 0.00000000
...
51           7.0         3.2          4.7         1.4 versicolor         NA
52           6.4         3.2          4.5         1.5 versicolor 0.07071068
53           6.9         3.1          4.9         1.5 versicolor 0.00000000
...
101          6.3         3.3          6.0         2.5  virginica         NA
102          5.8         2.7          5.1         1.9  virginica 0.42426407
103          7.1         3.0          5.9         2.1  virginica 0.14142136
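
As an alternative that is not part of the original answer, recent data.table releases ship their own rolling helper, frollapply, which is right-aligned and pads incomplete windows with NA by default. The sketch below assumes an installed data.table version that provides frollapply. The table name B is illustrative.

```r
library(data.table)

# Work on a fresh copy so the example does not depend on earlier steps
B <- as.data.table(iris)

# frollapply() defaults to align = "right" and fill = NA, so every group
# receives as many values as it has rows and nothing is recycled
B[, stdev := frollapply(Petal.Width, 2, sd), by = Species]

# One NA per group (the first row), 50 rows per group
print(B[, .(n = .N, n_na = sum(is.na(stdev))), by = Species])
```

This avoids the zoo dependency for the rolling step. Whether it is preferable depends on which packages are already in use and on the installed data.table version.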


Source: https://stackoverflow.com/questions/43107071/apply-a-rolling-function-by-group-in-r-zoo-data-table
