How to sum rows based on multiple conditions - R? [duplicate]

Posted by 久未见 on 2019-12-04 15:33:03
ExperimenteR

Easy with aggregate

aggregate(cover~species+plotID, data=df_original, FUN=sum) 
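To try the formula interface without the original data, here is a minimal sketch on a made-up data frame. The column names (plotID, species, cover) follow the question; `df_toy` and its values are invented for illustration and are not the asker's data.

```r
# Hypothetical toy data; the values are invented for illustration only.
df_toy <- data.frame(
  plotID  = c("P1", "P1", "P1", "P2", "P2"),
  species = c("ABBA", "ABBA", "PIMA", "PIMA", "PIMA"),
  cover   = c(1.5, 2.5, 4, 3, 7)
)

# Sum cover within each species/plotID combination.
res <- aggregate(cover ~ species + plotID, data = df_toy, FUN = sum)
print(res)
```

Each distinct species/plotID pair ends up as one row, with cover holding the group total.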

Easier with data.table

library(data.table)
as.data.table(df_original)[, sum(cover), by = .(plotID, species)]

You can do this in a number of ways. Base R, dplyr and data.table are the most typical.

Here is the dplyr way:

library(dplyr)

df_original %>% group_by(plotID, species) %>% summarize(cover = sum(cover))

#          plotID species     cover
#1 SUF200001035014    ABBA 26.893939
#2 SUF200001035014    BEPA  5.681818
#3 SUF200001035014   PIBA2  9.469697
#4 SUF200001035014    PIMA 16.287879
#5 SUF200001035014    PIRE  1.893939
#6 SUF200046012040   PIBA2 20.454546
#7 SUF200046012040    PIMA 27.651515
#8 SUF200046012040    PIRE 11.363636
#9 SUF200046012040   POTR5 31.439394
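One detail worth knowing: with more than one grouping variable, recent dplyr versions (1.0.0 and later) return a result that is still grouped by the first variable. A minimal sketch, assuming dplyr 1.0.0 or later and using made-up data (`df_toy` is hypothetical), shows how `.groups = "drop"` returns an ungrouped result instead:

```r
library(dplyr)

# Hypothetical toy data; the values are invented for illustration only.
df_toy <- data.frame(
  plotID  = c("P1", "P1", "P1", "P2", "P2"),
  species = c("ABBA", "ABBA", "PIMA", "PIMA", "PIMA"),
  cover   = c(1.5, 2.5, 4, 3, 7)
)

# Sum cover per plotID/species pair and drop all grouping afterwards.
res <- df_toy %>%
  group_by(plotID, species) %>%
  summarise(cover = sum(cover), .groups = "drop")

print(res)
```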

This is the base R way:

aggregate(df_original$cover, by=list(df_original$plotID, df_original$species), FUN=sum)
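With `by = list(...)` written this way, the result columns get the default names Group.1, Group.2 and x. A small sketch of one way to keep readable names, assuming made-up data (`df_toy` is hypothetical): name the list elements and pass the value column as a one-column data frame.

```r
# Hypothetical toy data; the values are invented for illustration only.
df_toy <- data.frame(
  plotID  = c("P1", "P1", "P1", "P2", "P2"),
  species = c("ABBA", "ABBA", "PIMA", "PIMA", "PIMA"),
  cover   = c(1.5, 2.5, 4, 3, 7)
)

# Named list elements become the grouping column names;
# df_toy["cover"] keeps "cover" as the name of the summed column.
res <- aggregate(
  df_toy["cover"],
  by  = list(plotID = df_toy$plotID, species = df_toy$species),
  FUN = sum
)
print(names(res))
```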

And a data.table way:

    library(data.table)
    DT <- as.data.table(df_original)
    DT[, lapply(.SD,sum), by = "plotID,species"]
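Note that `lapply(.SD, sum)` sums every non-grouping column, so it errors if the table also holds a non-numeric column. A minimal sketch, assuming made-up data with an extra character column (`DT_toy` and `observer` are hypothetical), restricts the sum to cover with `.SDcols`:

```r
library(data.table)

# Hypothetical toy data with an extra non-numeric column ("observer").
DT_toy <- data.table(
  plotID   = c("P1", "P1", "P2"),
  species  = c("ABBA", "ABBA", "PIMA"),
  observer = c("a", "b", "c"),
  cover    = c(1.5, 2.5, 3)
)

# .SDcols limits .SD to the listed columns, so only cover is summed.
res <- DT_toy[, lapply(.SD, sum), by = .(plotID, species), .SDcols = "cover"]
print(res)
```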

Another option is ddply from the plyr package:

    library(plyr)
    ddply(df_original, c("plotID", "species"), summarise, cover2 = sum(cover))


               plotID species    cover2
    1 SUF200001035014    ABBA 26.893939
    2 SUF200001035014    BEPA  5.681818
    3 SUF200001035014   PIBA2  9.469697
    4 SUF200001035014    PIMA 16.287879
    5 SUF200001035014    PIRE  1.893939
    6 SUF200046012040   PIBA2 20.454546
    7 SUF200046012040    PIMA 27.651515
    8 SUF200046012040    PIRE 11.363636
    9 SUF200046012040   POTR5 31.439394