do.call to build and execute data.table commands

Submitted by 馋奶兔 on 2019-12-11 03:28:00

Question


I have a small data.table with one record per test cell (A/B testing results), and I want to add several columns that compare each test cell against every other test cell. In other words, the number of columns I want to add depends on how many test cells are in the A/B test in question.

My data.table looks like:

Group   Delta     SD.diff
Control     0           0
Cell1 0.00200 0.001096139
Cell2 0.00196 0.001095797
Cell3 0.00210 0.001096992
Cell4 0.00160 0.001092716

And I want to add the following columns (the numbers here are placeholders):

Group v.Cell1    v.Cell2   v.Cell3   v.Cell4
Control  0.45       0.41      0.45      0.41 
Cell1    0.50       0.58      0.48      0.66
Cell2    0.58       0.50      0.58      0.48
Cell3    0.48       0.58      0.50      0.70
Cell4    0.66       0.48      0.70      0.50

I am sure that do.call is the way to go, but I can't work out how to embed one do.call inside another to generate the script, and I can't work out how to then execute the generated lines (20 in total). The closest I have got so far is:

a <- do.call("paste",c("test.1.results <- mutate(test.1.results, P.Better.",list(unlist(test.1.results[,Group]))," = pnorm(Delta, test.1.results['",list(unlist(test.1.results[,Group])),"'][,Delta], SD.diff,lower.tail=TRUE))", sep=""))

Which produces 5 script lines like:

test.1.results <- mutate(test.1.results, P.Better.Cell2 = pnorm(Delta, test.1.results['Cell2'][,Delta], SD.diff,lower.tail=TRUE))

This only compares each test cell's result against itself, giving 0.50 (a difference due to chance). That is no use whatsoever, as I need each test cell compared to every other.

Not sure where to go with this one.


Answer 1:


Update: As of v1.8.11, FR #2077 is implemented: set() can now add columns by reference. From NEWS:

set() is able to add new columns by reference now. For example, set(DT, i=3:5, j="bla", 5L) is equivalent to DT[3:5, bla := 5L]. This was FR #2077. Tests added.


Tasks like this are often easier with set(). To demonstrate, here is an untested translation of what you have in the question. I realise, though, that you want something different from what you've posted, which I don't fully understand at a quick read.

# DT[i] below looks rows up by key, so run setkey(DT, Group) first
for (i in paste0("Cell", 1:4))
  set(DT,                       # the data.table to update/add column by reference
    i = NULL,                   # no row subset, NULL is default anyway
    j = paste0("P.Better.", i), # column name or position. must be name when adding
    value = pnorm(DT$Delta, DT[i][, Delta], DT$SD.diff, lower.tail = TRUE))
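For the layout the question actually asks for (one "v.<cell>" column per non-control cell), the same set() loop can be sketched as below. This is only an illustration under assumptions: the data are the sample rows from the question, the pnorm() call simply reuses the question's own formula (whether it is the right statistical comparison is not settled here), and the rows are selected with Group == cell so that no key is needed.

```r
library(data.table)

# Sample rows copied from the question
DT <- data.table(
  Group   = c("Control", "Cell1", "Cell2", "Cell3", "Cell4"),
  Delta   = c(0, 0.00200, 0.00196, 0.00210, 0.00160),
  SD.diff = c(0, 0.001096139, 0.001095797, 0.001096992, 0.001092716)
)

# Add one column per non-control cell, by reference
for (cell in setdiff(DT$Group, "Control")) {
  ref_delta <- DT[Group == cell, Delta]  # Delta of the cell being compared against
  set(DT,
      j     = paste0("v.", cell),        # e.g. "v.Cell1", as in the desired output
      value = pnorm(DT$Delta, mean = ref_delta, sd = DT$SD.diff))
}

print(DT)
```

Each row is compared against a different reference cell in every column, so only the diagonal (a cell against itself) comes out as 0.5.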

Note that you can assign to only a subset of rows of a new column, and the rest will be filled with NA. This works with both := and set().
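A minimal sketch of that NA-filling behaviour, using a made-up six-row table (the table and the column name bla2 are illustrative only; bla and the 3:5 subset follow the NEWS example quoted above):

```r
library(data.table)

DT <- data.table(x = 1:6)

# Assign to rows 3 to 5 of a column that does not exist yet, using :=
DT[3:5, bla := 5L]

# The same with set(); needs data.table v1.8.11 or later to add a column
set(DT, i = 3:5, j = "bla2", value = 5L)

print(DT)  # rows 1, 2 and 6 hold NA in both new columns
```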



Source: https://stackoverflow.com/questions/13869144/do-call-to-build-and-execute-data-table-commands
