Referring to data.table columns by names saved in variables

借酒劲吻你 2020-11-29 21:21

data.table is a fantastic R package and I am using it in a library I am developing. So far all is going very well, except for one complication. It seems to be m

4 answers
  •  南方客 (original poster)
    2020-11-29 22:27

    If you are going to be doing complicated operations inside your j expressions, you should probably use eval and quote. One problem with that in the current version of data.table is that the environment of eval is not always processed correctly (see "eval and quote in data.table"; note that there has been an update to that answer based on an update to the package). The current fix is to pass .SD as the second argument to eval. As far as I can tell from a few tests that I've run, this doesn't affect speed (the way that, e.g., having .SD[1] in j would).

    Interestingly, this issue only affects j, and you'll be fine using eval normally in i (where .SD is not available anyway).
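
    To illustrate the point about i, here is a minimal sketch in which a whole filter condition is stored as a quoted expression and evaluated in i without any reference to .SD. The names cond and small are introduced here only for illustration.

```r
library(data.table)

x <- data.table(dist = 1:10, val = 1:10)

# A filter condition held in a variable as an unevaluated expression
cond <- quote(val < 5)

# eval() in i resolves `val` against the columns of x
small <- x[eval(cond)]
```

    Here small should contain only the rows of x where val is below 5.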

    The other problem is assignment, and there you have to have strings. I know one way to extract the string name from a quoted expression - it's not pretty, but it works. Here's an example combining everything together:

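    library(data.table)

    x = data.table(dist = 1:10, val = 1:10)
    distcol = quote(dist)  # quoted column names, held in variables
    valcol = quote(val)

    # i: plain eval works; j: the eval inside sum() needs .SD as its environment.
    # The LHS of := must be a string, recovered here from the quoted symbol.
    x[eval(valcol) < 5,
      capture.output(str(distcol, give.head = FALSE)) := eval(distcol)*sum(eval(distcol, .SD))]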
    

    Note that I got away with not adding .SD to the first eval(distcol), but the code stops working if I remove it from the other eval, the one inside sum().
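
    As a possibly tidier way to recover the name, base R's as.character() (deparse() would also do) turns a quoted symbol into a string. The following is a sketch under that assumption rather than the approach used above; the variable target is introduced only for illustration.

```r
library(data.table)

x <- data.table(dist = 1:10, val = 1:10)
distcol <- quote(dist)
valcol <- quote(val)

# as.character() on a symbol gives its name as a string
target <- as.character(distcol)

# Wrapping the variable in parentheses makes := use its value as the column name
x[eval(valcol) < 5, (target) := eval(distcol) * sum(eval(distcol, .SD))]
```

    The parentheses around target matter: a bare name on the left of := is taken as the literal column name, so without them a new column called target would be created.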

    Another option is to use get:

    diststr = "dist"
    valstr = "val"
    
    # get() looks each column up by its string name; c(diststr) makes := use the variable's value
    x[get(valstr) < 5, c(diststr) := get(diststr)*sum(get(diststr))]
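
    The same update can also be written without get(). A minimal sketch, assuming a fresh copy of the table (named y here only for illustration): wrapping the variable in parentheses on the left of := makes data.table use its value as the column name, and .SDcols restricts .SD to the column named by the string.

```r
library(data.table)

y <- data.table(dist = 1:10, val = 1:10)
diststr <- "dist"
valstr <- "val"

# i: y[[valstr]] fetches the filter column by its string name
# j: .SD holds only the column named in diststr, for the rows selected by i
y[y[[valstr]] < 5,
  (diststr) := .SD[[1L]] * sum(.SD[[1L]]),
  .SDcols = diststr]
```

    Rows where val is below 5 get dist multiplied by the sum of dist over those rows; all other rows are left unchanged.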
    
