Question
I have a data frame mydata such as the following:
col1 col2
1 1 1
2 1 2
3 1 3
4 2 1
5 2 2
6 2 3
I want to lag col2 within the groups in col1, so my expected result would be as follows:
col1 col2
1 1 NA
2 1 1
3 1 2
4 2 NA
5 2 1
6 2 2
Following the procedure from [this answer][1], I tried:
with_lagged_col2 =
mydata %>% group_by(col1) %>% arrange(col1) %>%
mutate(laggy = dplyr::lag(col2, n = 1, default = NA))
And what I actually get is
# A tibble: 6 x 3
# Groups: col1 [2]
col1 col2 laggy
<dbl> <dbl> <dbl>
1 1 1 NA
2 1 2 1
3 1 3 2
4 2 1 3
5 2 2 1
6 2 3 2
Why is group_by being ignored here?
Answer 1:
You don't need the arrange() call:
with_lagged_col2 =
mydata %>% group_by(col1) %>% # group the data by col1
mutate(laggy = dplyr::lag(col2, n = 1, default = NA)) # lag col2 within each group; the first row of each group has no previous value, so it is NA
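For anyone who wants to reproduce this, here is a minimal self-contained sketch. The data frame is rebuilt by hand from the table in the question. The dplyr:: prefixes on group_by and mutate are my own precaution, not something the answer states: they make the example independent of which other packages are attached, on the assumption that a same-named function from another package (for example plyr::mutate, which does not use grouping) could otherwise be picked up instead. The final ungroup() is also my addition.

```r
library(dplyr)

# Rebuild the example data from the question
mydata <- data.frame(
  col1 = c(1, 1, 1, 2, 2, 2),
  col2 = c(1, 2, 3, 1, 2, 3)
)

with_lagged_col2 <- mydata %>%
  dplyr::group_by(col1) %>%                                      # one group per value of col1
  dplyr::mutate(laggy = dplyr::lag(col2, n = 1, default = NA)) %>% # lag col2 inside each group
  dplyr::ungroup()                                               # drop grouping for later steps

print(with_lagged_col2)
# laggy is NA, 1, 2, NA, 1, 2: the lag restarts at the first row of each group
```

If the rows were not already in the desired order within each group, they would need to be sorted before the mutate step, since lag works on the row order it is given.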
Source: https://stackoverflow.com/questions/51454717/lag-colum-by-group-in-dplyr