Sum multiple columns by group with tapply

北荒 2020-12-15 23:03

I wanted to sum individual columns by group and my first thought was to use tapply. However, I cannot get tapply to work. Can tapply

3 answers
  •  慢半拍i (original poster)
    2020-12-15 23:45

    I looked at the source code for by, as EDi suggested. That code was substantially more complex than my one-line change to tapply. I have since found that my.tapply does not work in the more complex scenario below, where apples and cherries are summed by state and county. If I get my.tapply to work for this case, I will post the code here later:

    df.2 <- read.table(text = '
    
        state   county   apples   cherries   plums
           AA        1        1          2       3
           AA        1        1          2       3
           AA        2       10         20      30
           AA        2       10         20      30
           AA        3      100        200     300
           AA        3      100        200     300
    
           BB        7       -1         -2      -3
           BB        7       -1         -2      -3
           BB        8      -10        -20     -30
           BB        8      -10        -20     -30
           BB        9     -100       -200    -300
           BB        9     -100       -200    -300
    
    ', header = TRUE, stringsAsFactors = FALSE)
    
    # my function works
    
       tapply(df.2$apples  , list(df.2$state, df.2$county), function(x) {sum(x)})
    my.tapply(df.2$apples  , list(df.2$state, df.2$county), function(x) {sum(x)})
    
    # my function works
    
       tapply(df.2$cherries, list(df.2$state, df.2$county), function(x) {sum(x)})
    my.tapply(df.2$cherries, list(df.2$state, df.2$county), function(x) {sum(x)})
    
    # my function does not work
    
    my.tapply(df.2[,3:4], list(df.2$state, df.2$county), function(x) {colSums(x)})
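
For comparison, here is a minimal base R sketch of two ways to get the same per-group sums for several columns without modifying tapply. It is not the poster's my.tapply, whose code is not shown here. The data frame `df` and the result names `sums_agg` and `sums_tap` are made up for illustration, and `df` is a cut-down copy of df.2.

```r
# Illustrative data only: a cut-down copy of df.2 (the name `df` is hypothetical).
df <- data.frame(
  state    = rep(c("AA", "BB"), each = 4),
  county   = c(1, 1, 2, 2, 7, 7, 8, 8),
  apples   = c(1, 1, 10, 10, -1, -1, -10, -10),
  cherries = c(2, 2, 20, 20, -2, -2, -20, -20)
)

# Option 1: aggregate() sums several columns per (state, county) group
# and returns one row per group that actually occurs in the data.
sums_agg <- aggregate(cbind(apples, cherries) ~ state + county,
                      data = df, FUN = sum)

# Option 2: call tapply() once per column. The result is a named list
# holding one state-by-county matrix for each summed column.
sums_tap <- lapply(df[, c("apples", "cherries")], function(col) {
  tapply(col, list(df$state, df$county), sum)
})

print(sums_agg)
print(sums_tap$apples)
```

Option 1 gives a long table that is easy to merge back into other data. Option 2 keeps the cross-tabulated shape that tapply returns for a single column, so it may be closer to what my.tapply was meant to produce.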
    
