Question
In the following example I try to compute the first coefficient (the one on r2) of a linear model fitted on the observations from t = 1 up to the current t, i.e. an expanding window.
It works well with ungrouped data, but when I group by case I get this error:
Error: Column `coef1` must be length 10 (the group size) or one, not 30.
How can I handle grouped data?
library(dplyr)
library(slider)
get_coef1 <- function(data) {
  coef1 <- lm(data = data, r1 ~ r2 + r3) %>%
    coef() %>%
    .["r2"] %>%
    unname()
  return(coef1)
}
data <- tibble(t = rep(1:10, 3),
               case = c(rep("a", 10), rep("b", 10), rep("c", 10)),
               r1 = rnorm(30),
               r2 = rnorm(30),
               r3 = rnorm(30))
data %>%
  # ungroup() %>%
  group_by(case) %>%
  mutate(coef1 = slider::slide_dbl(., ~get_coef1(.x),
                                   .before = Inf, .complete = TRUE))
Answer 1:
You first have to tidyr::nest the cases. Within the nested tibbles (accessed via purrr::map) you can then apply slide, using the same formula technique as with purrr::map. The important point is that you do not want to slide across cases, only within each case.
library(dplyr)
library(tidyr)
library(purrr)
library(slider)
get_coef1 <- function(data) {
  coef1 <- lm(data = data, r1 ~ r2 + r3) %>%
    coef() %>%
    .["r2"] %>%
    unname()
  return(coef1)
}
data <- tibble(t = rep(1:10, 3),
               case = c(rep("a", 10), rep("b", 10), rep("c", 10)),
               r1 = rnorm(30),
               r2 = rnorm(30),
               r3 = rnorm(30))
data %>%
  # ungroup() %>%
  group_by(case) %>%
  nest() %>%
  # slide within each nested tibble, so windows never cross cases
  mutate(rollreg = map(data, ~ .x %>%
    mutate(coef1 = slider::slide_dbl(., ~get_coef1(.x),
                                     .before = Inf, .complete = TRUE)))) %>%
  select(-data) %>%
  unnest(rollreg)
I have been trying for a while to use the new dplyr::nest_by() from dplyr 1.0.0, combining summarise with the rowwise cases, but couldn't get that to work.
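As a possible alternative to nesting, the following is a minimal sketch using dplyr::group_modify(), which calls a function once per group and passes that group's rows as .x. It rests on the observation that, in the question's code, the . inside mutate() is the pipe's left-hand side, that is the whole 30-row tibble rather than the current group, which is why slide_dbl() returns 30 values for a group of 10. The set.seed() call, the rep(..., each = 10) shorthand and the name result are assumptions added here for reproducibility and are not part of the original post.

```r
library(dplyr)
library(slider)

# Same helper as in the question: coefficient on r2 from r1 ~ r2 + r3.
get_coef1 <- function(data) {
  unname(coef(lm(r1 ~ r2 + r3, data = data))["r2"])
}

set.seed(1)  # assumption: fixed seed only so the sketch is reproducible
data <- tibble(t = rep(1:10, 3),
               case = rep(c("a", "b", "c"), each = 10),
               r1 = rnorm(30),
               r2 = rnorm(30),
               r3 = rnorm(30))

result <- data %>%
  group_by(case) %>%
  # .x is the tibble of the current group only, so the expanding
  # window (.before = Inf) starts over at t = 1 within every case.
  group_modify(~ mutate(.x, coef1 = slide_dbl(.x, get_coef1, .before = Inf))) %>%
  ungroup()
```

Passing get_coef1 directly instead of a nested ~get_coef1(.x) formula avoids having two different .x pronouns in one expression. The value at t = 1 of each case is NA, because r2 cannot be estimated from a single observation.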
Source: https://stackoverflow.com/questions/61390850/rolling-window-sliderslide-with-grouped-data