Question
I have gotten in the habit of accessing data.table columns in j even when I do not need to:
require(data.table)
set.seed(1); n = 10
DT <- data.table(x=rnorm(n),y=rnorm(n))
frm <- formula(x~y)
DT[,lm(x~y)] # 1 works
DT[,lm(frm)] # 2 fails
lm(frm,data=DT) # 3 what I'll do instead
I expected # 2 to work, since lm should search for variables in DT and then in the global environment... Is there an elegant way to get something like # 2 to work?
In this case, I'm using lm, which takes a "data" argument, so # 3 works just fine.
EDIT. Note that this works:
x1 <- DT$x
y1 <- DT$y
frm1 <- formula(x1~y1)
lm(frm1)
and this, too:
rm(x1,y1)
bah <- function(){
  x1 <- DT$x
  y1 <- DT$y
  frm1 <- formula(x1~y1)
  lm(frm1)
}
bah()
EDIT2. However, this fails, illustrating @eddi's answer:
frm1 <- formula(x1~y1)
bah1 <- function(){
  x1 <- DT$x
  y1 <- DT$y
  lm(frm1)
}
bah1()
Answer 1:
When no data argument is given, lm looks up the variables of a formula in the environment attached to that formula. Since you create your formula in the global environment, lm is not going to look in the j-expression environment. The only way to make the exact expression lm(frm) work is therefore to add the appropriate variables to the formula's environment:
# Copy the columns into the formula's environment (here the global one), then fit
DT[, {assign('x', x, environment(frm));
      assign('y', y, environment(frm));
      lm(frm)}]
Obviously this is not a very good solution, since it writes x and y into the global environment, and both Arun's and Josh's suggestions are much better. I'm only putting it here to help explain the problem at hand.
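The lookup rule described above can be seen without data.table at all. The following is a minimal base-R sketch; the names frm_demo, u, v and the two helper functions are hypothetical and exist only for illustration. It assumes nothing called u or v is defined at the top level of your session.

```r
# Hypothetical names throughout; this only illustrates where lm() looks for variables.
# The formula is created at the top level, so its environment is the top-level one.
frm_demo <- v ~ u

noise <- c(0.1, -0.2, 0.3, -0.1, 0.2, -0.3, 0.1, 0.0, -0.1, 0.2)

fit_with_outer_formula <- function() {
  u <- 1:10
  v <- 2 * u + noise
  # u and v exist only in this function, but lm() searches environment(frm_demo)
  lm(frm_demo)
}

fit_with_local_formula <- function() {
  u <- 1:10
  v <- 2 * u + noise
  local_frm <- frm_demo                    # work on a copy; frm_demo is left untouched
  environment(local_frm) <- environment()  # point the copy at this function's frame
  lm(local_frm)
}

# The first call is expected to fail because u and v are not visible from the formula
err <- tryCatch(fit_with_outer_formula(), error = function(e) conditionMessage(e))
fit <- fit_with_local_formula()
```

Copying the formula before changing its environment keeps the original formula usable elsewhere, which is the less intrusive variant of the idea.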
Edit: Another (possibly more perverse, and quite fragile) way would be to change the environment of the formula itself (I do it permanently here, but you could revert it afterwards, or copy the formula first and change the copy):
DT[, {setattr(frm, '.Environment', get('SDenv', parent.frame(2))); lm(frm)}]
By the way, a funny thing is happening here: whenever you use get in a j-expression, data.table constructs all of the columns as variables (so don't use it if you can avoid it). This is why I don't need to also mention x and y in some way for data.table to know that those variables are needed.
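For comparison with the two workarounds above, here is a sketch that leaves the formula's environment alone and instead hands the columns to lm through its data argument, in the spirit of # 3 from the question. It assumes that .SD (data.table's per-group subset, which with no by holds all columns of DT) is acceptable as data for lm.

```r
library(data.table)

set.seed(1); n <- 10
DT <- data.table(x = rnorm(n), y = rnorm(n))
frm <- x ~ y

# Inside j, .SD carries the columns of DT, so lm() resolves x and y from `data`
fit_sd <- DT[, lm(frm, data = .SD)]

# The same model fitted outside j, for comparison
fit_ref <- lm(frm, data = DT)

# The two sets of coefficients should agree
all.equal(coef(fit_sd), coef(fit_ref))
```

Because the variables come from data, neither the global environment nor the formula is modified.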
Source: https://stackoverflow.com/questions/19311600/using-lmmy-formula-inside-data-tables-j