How to create a lag variable within each group?

没有蜡笔的小新 2020-11-22 04:45

I have a data.table:

library(data.table)
set.seed(1)
data <- data.table(time = c(1:3, 1:4),
                   groups = c(rep(c("b", "a"), c(3, 4))),
                   v         


        
5 Answers
  •  余生分开走
    2020-11-22 05:27

    Using package dplyr:

    library(dplyr)
    data <- 
        data %>%
        group_by(groups) %>%
        mutate(lag.value = dplyr::lag(value, n = 1, default = NA))
    

    gives

    > data
    Source: local data table [7 x 4]
    Groups: groups
    
      time groups       value   lag.value
    1    1      a  0.07614866          NA
    2    2      a -0.02784712  0.07614866
    3    3      a  1.88612245 -0.02784712
    4    1      b  0.26526825          NA
    5    2      b  1.23820506  0.26526825
    6    3      b  0.09276648  1.23820506
    7    4      b -0.09253594  0.09276648
    

    As noted by @BrianD, this implicitly assumes that the rows are already in the desired order within each group (here, by time). If they are not, either sort the data first or use the order_by argument of lag. Also note that, due to an existing issue with some versions of dplyr, it is safer to give the arguments and the namespace explicitly, that is, dplyr::lag(value, n = 1, default = NA) rather than a bare lag(value, 1).
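
    For unsorted input, the order_by route might look like the sketch below. The data frame df and its values are hypothetical, made up only to show rows that are shuffled within each group; the column names follow the question.

    ```r
    library(dplyr)

    # Hypothetical toy data (not from the question), shuffled within each group
    df <- data.frame(
      time   = c(3, 1, 2, 2, 1),
      groups = c("a", "a", "a", "b", "b"),
      value  = c(30, 10, 20, 200, 100)
    )

    res <- df %>%
      group_by(groups) %>%
      # order_by = time makes lag() follow time order inside each group,
      # while the rows themselves stay in their original positions
      mutate(lag.value = dplyr::lag(value, n = 1, default = NA, order_by = time)) %>%
      ungroup()

    res$lag.value
    # 20 NA 10 100 NA
    ```

    Each row receives the value from the previous time point of its own group, and the earliest time point in each group gets NA. Sorting first with arrange(groups, time) and then calling lag without order_by should give the same pairs, only with the rows reordered.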
