How do I sum the values of columns in several tables if tables have different lengths?

离开以前 2021-01-17 23:35

Alright, this should be an easy one, but I'm looking for a solution that's as fast as possible.

Let's say I have 3 tables (the number of tables will be much larger

3 Answers
  •  遇见更好的自我
    2021-01-18 00:34

    You could use rowsum(). It returns a one-column matrix with the category names as row names, so the output will be slightly different from what you show, but you can always restructure it after the calculations. rowsum() is known to be very efficient.

    x <- c(tab1, tab2, tab3)
    rowsum(x, names(x))
    #   [,1]
    # 1    7
    # 2    3
    # 3    4
    # 4    3
    # 5    1
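
For readers who want to run the approach end to end, here is a minimal sketch. The three tables below are made-up stand-ins (the original tab1, tab2 and tab3 are not shown here), and the name totals is only illustrative.

```r
# Hypothetical input: three frequency tables of different lengths.
# These values are invented for illustration; they are not the asker's data.
tab1 <- table(c(1, 1, 2, 3))      # counts: "1" -> 2, "2" -> 1, "3" -> 1
tab2 <- table(c(1, 2, 2, 4, 4))   # counts: "1" -> 1, "2" -> 2, "4" -> 2
tab3 <- table(c(3, 5))            # counts: "3" -> 1, "5" -> 1

# c() flattens the tables into one named integer vector,
# keeping each table's category labels as names.
x <- c(tab1, tab2, tab3)

# rowsum() adds up the values that share a name and
# returns a one-column matrix with the names as row names.
res <- rowsum(x, names(x))

# Taking the single column gives a plain named vector again.
totals <- res[, 1]
print(totals)
```

Taking the first column is one way to do the restructuring mentioned above; whether a named vector or a table is the better final shape depends on what the result is used for.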
    

    Here's a benchmark with akrun's data.table suggestion added in as well.

    library(microbenchmark)
    library(data.table)
    
    xx <- rep(x, 1e5)
    
    microbenchmark(
        tapply = tapply(xx, names(xx), FUN=sum),
        rowsum = rowsum(xx, names(xx)),
        data.table = data.table(xx, names(xx))[, sum(xx), by = V2]
    )
    # Unit: milliseconds
    #        expr       min        lq      mean    median        uq       max neval
    #      tapply 150.47532 154.80200 176.22410 159.02577 204.22043 233.34346   100
    #      rowsum  41.28635  41.65162  51.85777  43.33885  45.43370 109.91777   100
    #  data.table  21.39438  24.73580  35.53500  27.56778  31.93182  92.74386   100
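
Before reading too much into timings like these, it may be worth confirming that the approaches being compared return the same totals. The following is a small base-R sketch of such a check for tapply() and rowsum(); the named vector is a made-up stand-in for c(tab1, tab2, tab3), and same_values and same_groups are illustrative names.

```r
# Hypothetical named vector standing in for c(tab1, tab2, tab3);
# the values are invented for illustration.
x <- c("1" = 2L, "2" = 1L, "3" = 1L, "1" = 1L, "2" = 2L, "4" = 2L)

# rep() repeats the names along with the values,
# which is what the benchmark above relies on.
xx <- rep(x, 1000)

by_tapply <- tapply(xx, names(xx), FUN = sum)  # 1-d array, one entry per name
by_rowsum <- rowsum(xx, names(xx))[, 1]        # first column of the result matrix

# Compare both the group labels and the summed values.
same_groups <- identical(names(by_tapply), names(by_rowsum))
same_values <- all(as.vector(by_tapply) == as.vector(by_rowsum))
print(c(same_groups = same_groups, same_values = same_values))
```

The same comparison could be extended to the data.table result, which is left out here to keep the sketch free of non-base packages.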
    
