Calculate first difference by group in R

野的像风 2020-12-18 11:54

I was wondering if someone could help me calculate the first difference of a score by group. I know it should be a simple process but for some reason I'm having trouble doi

4 Answers
  • 2020-12-18 12:05

    Another approach using dplyr:

    library(dplyr)
    
    score <- c(10,30,14,20,6)
    group <- c(rep(1001,2),rep(1005,3))
    df <- data.frame(score,group)
    
    df %>%
      group_by(group) %>%
      mutate(first_diff = score - lag(score))
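
    Note that lag() works on the rows in their current order, so the data should be sorted within each group first. A minimal sketch, assuming a hypothetical ordering column `time` (not part of the question) and a made-up unsorted data frame `df2`:

    ```r
    library(dplyr)

    # Hypothetical data: `time` is an assumed ordering column, not in the question
    df2 <- data.frame(
      group = c(1005, 1001, 1005, 1001, 1005),
      time  = c(2, 1, 1, 2, 3),
      score = c(20, 10, 14, 30, 6)
    )

    res <- df2 %>%
      arrange(group, time) %>%                    # sort within each group first
      group_by(group) %>%
      mutate(first_diff = score - lag(score)) %>% # NA for the first row of a group
      ungroup()

    res$first_diff
    ```

    After sorting, the first row of each group gets NA and the remaining rows hold the change from the previous observation.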
    
  • 2020-12-18 12:18

    This should do the trick, although it uses a loop rather than an apply function, so there is likely room for improvement in clarity and efficiency. It assumes the rows of df are already sorted by group.

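    # Preallocate with NA; out[1] will always be NA
    out <- rep(NA_real_, nrow(df))
    for (i in seq_len(nrow(df))[-1]) {
      # Only take the difference when the previous row is in the same group
      if (df$group[i] == df$group[i - 1]) {
        out[i] <- df$score[i] - df$score[i - 1]
      }
    }
    out
    #> [1]  NA  20  NA   6 -14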
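
    The same logic can be written without a loop. A minimal sketch, under the same assumption that df is sorted by group; `new_group` is just an illustrative helper name:

    ```r
    score <- c(10, 30, 14, 20, 6)
    group <- c(rep(1001, 2), rep(1005, 3))
    df <- data.frame(score, group)

    # Difference from the previous row, with NA for the very first row
    out <- c(NA, diff(df$score))

    # TRUE where a row starts a new group (the first row always does)
    new_group <- c(TRUE, df$group[-1] != df$group[-nrow(df)])

    # Differences that cross a group boundary are not meaningful
    out[new_group] <- NA
    out
    ```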
    
  • 2020-12-18 12:20

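    This is one way using base R. It assumes df is already sorted by group, because by() returns its results in group order:

    df$diff <- unlist(by(df$score, list(df$group), function(i) c(NA, diff(i))))

    or, with ave(), which keeps the original row order:

    df$diff <- ave(df$score, df$group, FUN = function(i) c(NA, diff(i)))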
    


    or using data.table, which will be more efficient for larger data.frames:

    library(data.table)
    dt <- as.data.table(df)
    # setkey() sorts dt by group
    setkey(dt, group)
    # Add the column by reference, computed within each group
    dt[, diff := c(NA, diff(score)), by = group]
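
    To illustrate why row order matters for the base R variants, here is a small sketch on a made-up data frame `df2` whose groups are interleaved (hypothetical data, not from the question). ave() puts each result back in the row it came from, while the unlist(by(...)) result comes back in group order and would be misaligned if assigned as a column:

    ```r
    # Hypothetical data with interleaved groups
    df2 <- data.frame(
      score = c(10, 14, 30, 20, 6),
      group = c(1001, 1005, 1001, 1005, 1005)
    )

    # ave() keeps the original row order
    df2$diff <- ave(df2$score, df2$group, FUN = function(x) c(NA, diff(x)))

    # by() returns one piece per group, in group order, not in row order
    by_order <- unname(unlist(by(df2$score, list(df2$group),
                                 function(x) c(NA, diff(x)))))

    df2
    by_order
    ```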
    
  • 2020-12-18 12:21

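    Although not exactly what you are looking for, ddply from the 'plyr' package can be used to calculate the differences by group. Because diff() returns one value fewer than its input, the result has no row for the first observation of each group: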

    library(plyr)
    out <- ddply(df, .(group), summarize, d1 = diff(score, 1))
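
    If every original row should be kept, one possible variation is to pass base R's transform() to ddply and pad each group with a leading NA. This is a sketch of that idea rather than part of the answer above; `out_full` is an illustrative name:

    ```r
    library(plyr)

    score <- c(10, 30, 14, 20, 6)
    group <- c(rep(1001, 2), rep(1005, 3))
    df <- data.frame(score, group)

    # transform() adds a column to each group's piece instead of summarizing it
    out_full <- ddply(df, .(group), transform, d1 = c(NA, diff(score)))
    out_full
    ```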
    