Efficiently computing a linear combination of data.table columns

-上瘾入骨i 2020-11-30 09:38

I have nc columns in a data.table, and nc scalars in a vector. I want to take a linear combination of the columns, but I don't know ahead of time

2 answers
  • 2020-11-30 10:11

    This is roughly 1.7x faster for me than your manual version:

    Reduce("+", lapply(names(DT), function(x) DT[[x]] * cf[x]))
    
    library(rbenchmark)  # provides benchmark(); cf is a named coefficient vector
    benchmark(manual = DT[, list(cf['A']*A+cf['B']*B+cf['C']*C+cf['D']*D)],
              reduce = Reduce('+', lapply(names(DT), function(x) DT[[x]] * cf[x])))
    #    test replications elapsed relative user.self sys.self user.child sys.child
    #1 manual          100    1.43    1.744      1.08     0.36         NA        NA
    #2 reduce          100    0.82    1.000      0.58     0.24         NA        NA
    

    And to iterate over just mycols, replace names(DT) with mycols in lapply.
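
    As a minimal, self-contained sketch of that substitution: the table `DT`, the coefficients `cf`, and the selection `mycols` below are made-up illustration data, and `cf` is assumed to be a named vector whose names match the column names.

    ```r
    library(data.table)

    # Hypothetical example data: four numeric columns A..D
    DT <- data.table(A = 1:3, B = 4:6, C = 7:9, D = 10:12)

    # Named coefficients; the names are assumed to match the column names
    cf <- c(A = 1, B = 2, C = 3, D = 4)

    # Columns chosen at run time (hypothetical selection)
    mycols <- c("A", "C")

    # Scale each selected column by its coefficient, then add the pieces elementwise
    lincomb <- Reduce("+", lapply(mycols, function(x) DT[[x]] * cf[x]))

    print(lincomb)  # 22 26 30, i.e. 1*A + 3*C
    ```

    Because the coefficients are looked up by name, the order of `mycols` does not need to match the order of `cf`.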

  • 2020-11-30 10:12

    Add this option to your benchmark call:

    ops = as.matrix(DT) %*% cf
    

    On my device it was 30% faster than the matrix multiplication you tried.
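
    A small sketch of the same matrix product restricted to a run-time column selection, using made-up data (`DT`, `cf`, and `mycols` are illustrative assumptions, not the asker's objects). Note that `%*%` pairs columns with coefficients by position, not by name, so the sketch indexes `cf` with the same `mycols` vector to keep the two aligned.

    ```r
    library(data.table)

    # Hypothetical example data and named coefficients
    DT <- data.table(A = 1:3, B = 4:6, C = 7:9, D = 10:12)
    cf <- c(A = 1, B = 2, C = 3, D = 4)
    mycols <- c("A", "C")

    # Select the columns by name, convert to a matrix, and multiply by the
    # coefficients taken in the same order as the selected columns
    m <- as.matrix(DT[, mycols, with = FALSE])
    ops <- m %*% cf[mycols]

    # %*% returns a one-column matrix; drop() reduces it to a plain vector
    res <- drop(ops)
    print(res)  # 22 26 30
    ```

    The `as.matrix()` conversion copies the selected columns, so whether this beats the `Reduce` version likely depends on the table's size; benchmarking on your own data is the safer guide.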
