dplyr: lead() and lag() wrong when used with group_by()

旧巷少年郎 2020-12-02 09:46

I want to find the lead() and lag() elements within each group, but I am getting wrong results.

For example, the data looks like this:

library(dplyr)
df = data.frame         


        
3 Answers
  •  甜味超标
    2020-12-02 10:29

    It seems you have to pass an additional argument to the lag and lead functions. When I run your code without arrange, but with order_by added, everything seems to work.

    df %>%
      group_by(name) %>%
      mutate(next.score = lead(score, order_by = name),
             before.score = lag(score, order_by = name))
    

    Output:

      name score next.score before.score
    1   Al   100         60           NA
    2  Jen    80        100           NA
    3   Al    60         80          100
    4  Jen   100         60           80
    5   Al    80         NA           60
    6  Jen    60         NA          100
    

    My sessionInfo():

    R version 3.1.1 (2014-07-10)
    Platform: x86_64-w64-mingw32/x64 (64-bit)
    
    locale:
    [1] LC_COLLATE=Polish_Poland.1250  LC_CTYPE=Polish_Poland.1250        LC_MONETARY=Polish_Poland.1250
    [4] LC_NUMERIC=C                   LC_TIME=Polish_Poland.1250    
    
    attached base packages:
    [1] stats     graphics  grDevices utils     datasets  methods   base     
    
    other attached packages:
    [1] dplyr_0.4.1
    
    loaded via a namespace (and not attached):
    [1] assertthat_0.1  DBI_0.3.1       lazyeval_0.1.10 magrittr_1.5                parallel_3.1.1  Rcpp_0.11.5    
    [7] tools_3.1.1 
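
The grouped lead()/lag() call from the answer above can be tried as a small self-contained sketch. The data frame here is an assumption, rebuilt from the name and score columns shown in the answer's output rather than taken from the question, and the helpers are written as dplyr::lead and dplyr::lag so they cannot be confused with stats::lag. It leaves out order_by and relies on the existing row order within each group.

```r
library(dplyr)

# Hypothetical data: rebuilt from the name/score columns in the output above
df <- data.frame(
  name  = c("Al", "Jen", "Al", "Jen", "Al", "Jen"),
  score = c(100, 80, 60, 100, 80, 60)
)

# Within each name, take the following and the preceding score
res <- df %>%
  group_by(name) %>%
  mutate(next.score   = dplyr::lead(score),
         before.score = dplyr::lag(score)) %>%
  ungroup()

print(res)
```

With this data, next.score and before.score should match the columns in the output above. If they do not, it may be worth checking whether another attached package masks lag(), lead(), or mutate().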
    
