Lag column by group in dplyr [closed]

Question

I have a data frame mydata like the following:

  col1 col2
1    1    1
2    1    2
3    1    3
4    2    1
5    2    2
6    2    3

I want to lag col2 within the groups defined by col1, so my expected result is the following:

  col1 col2
1    1   NA
2    1    1
3    1    2
4    2   NA
5    2    1
6    2    2

Following the procedure from [this answer][1], I tried:

with_lagged_col2 = 
  mydata %>% group_by(col1) %>% arrange(col1) %>% 
  mutate(laggy = dplyr::lag(col2, n = 1, default = NA))

What I actually get is:

# A tibble: 6 x 3
# Groups:   col1 [2]
   col1  col2 laggy
  <dbl> <dbl> <dbl>
1     1     1    NA
2     1     2     1
3     1     3     2
4     2     1     3
5     2     2     1
6     2     3     2

Why is group_by being ignored here?

Answer 1:

You don't need the arrange() call; group_by() followed by mutate() is enough:

with_lagged_col2 = 
  mydata %>% group_by(col1) %>% # group the data by col1
  mutate(laggy = dplyr::lag(col2, n = 1, default = NA)) # lag col2 within each group; the first row of each group has no previous value, so it is NA
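For reference, here is a self-contained sketch of the grouped lag, assuming mydata is built exactly as shown in the question. The explicit dplyr:: prefixes on group_by and mutate are my own addition, not something the question confirms is needed: if another package such as plyr is attached after dplyr, its mutate masks dplyr's and does not respect grouping, which would give an ungrouped lag like the one in the question.

```r
library(dplyr)

# Rebuild the example data from the question.
mydata <- data.frame(
  col1 = c(1, 1, 1, 2, 2, 2),
  col2 = c(1, 2, 3, 1, 2, 3)
)

with_lagged_col2 <- mydata %>%
  dplyr::group_by(col1) %>%   # namespaced so a masking package cannot interfere (assumption)
  dplyr::mutate(laggy = dplyr::lag(col2, n = 1, default = NA)) %>%
  dplyr::ungroup()            # drop the grouping once the lag is computed

print(with_lagged_col2)
```

With this data, laggy should come out as NA, 1, 2, NA, 1, 2. Running conflicts() in the affected session lists the function names that are defined in more than one attached package, which is one way to check whether mutate is being masked.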

Source: https://stackoverflow.com/questions/51454717/lag-colum-by-group-in-dplyr

Tags

dplyr

window-functions
