In R, how to sum certain rows of a data frame with certain logic?

情书的邮戳 2021-01-06 17:33

Hi experienced R users,

It's kind of a simple thing. I want to sum x by Group.1 depending on one controllable variable.

I'd like

5 answers
  •  渐次进展
    2021-01-06 18:16

    You could use the by function.

    For instance, given the following data.frame:

    d <- data.frame(Group.1=c(1,1,2,1,3,3,1,3),Group.2=c('Eggs'),x=1:8)
    
    > d
      Group.1 Group.2 x
    1       1    Eggs 1
    2       1    Eggs 2
    3       2    Eggs 3
    4       1    Eggs 4
    5       3    Eggs 5
    6       3    Eggs 6
    7       1    Eggs 7
    8       3    Eggs 8
    

    You can do this:

    num <- 3 # sum only the first 3 rows of each group
    
    # The aggregation function:
    # it is called for each group receiving the 
    # data.frame subset as input and returns the aggregated row
    innerFunc <- function(subDf){
      # we create the aggregated row by taking the first row of the subset
      row <- head(subDf,1)
      # we set the x column in the result row to the sum of the first "num"
      # elements of the subset
      row$x <- sum(head(subDf$x,num))
      return(row)
    }
    # Here we call the "by" function:
    # it returns an object of class "by" that is a list of the resulting
    # aggregated rows; we want to convert it to a data.frame, so we call
    # rbind repeatedly by using "do.call(rbind, ... )"
    d2 <- do.call(rbind,by(data=d,INDICES=d$Group.1,FUN=innerFunc))
    
    > d2
      Group.1 Group.2  x
    1       1    Eggs  7
    2       2    Eggs  3
    3       3    Eggs 19
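
As a possible alternative, the same per-group "sum of the first num rows" can probably be written more compactly with base R's aggregate(). This is only a sketch under the assumption that Group.2 is constant within each Group.1 (as in the sample data), so grouping by both columns gives one row per Group.1; the name d3 is made up for illustration.

```r
# Same sample data as in the answer above
d <- data.frame(Group.1 = c(1, 1, 2, 1, 3, 3, 1, 3), Group.2 = "Eggs", x = 1:8)
num <- 3  # number of leading rows to sum within each group

# aggregate() splits x by the grouping columns and applies FUN to each piece;
# head(v, num) keeps at most the first `num` values of that piece
d3 <- aggregate(x ~ Group.1 + Group.2, data = d,
                FUN = function(v) sum(head(v, num)))
print(d3)
```

With the sample data this should produce the same sums as d2 (7, 3 and 19). If Group.2 can vary within a Group.1, the by() approach above, which keeps the first row of each subset, is likely the safer choice.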
    
