Question
I'm looking to process columns selected by criteria such as class, or by pattern matching on their names via grep.
My first attempt did not work:
require(data.table)
test.table <- data.table(a=1:10,ab=1:10,b=101:110)
##this does not work and hangs on my machine
test.table[,lapply(names(test.table)[grep("a",names(test.table))], get)]
Ricardo Saporta notes in an answer that you can use this construct, but you have to wrap get in a dummy function:
##this works
test.table[,lapply(names(test.table)[grep("a",names(test.table))], function(x) get(x))]
Why do you need the anonymous function?
(The preferred/cleaner method is via .SDcols:)
test.table[,.SD,.SDcols=grep("a",names(test.table))]
test.table[, grep("a", names(test.table)), with = FALSE]
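Since the question also mentions selecting by class, here is a minimal sketch of the same .SDcols idea applied to column classes instead of name patterns. The table `mixed` and its columns are made up for illustration and are not from the original post; the matching column names are passed to .SDcols as a character vector.

```r
library(data.table)

# Hypothetical table with mixed column classes (not from the original post)
mixed <- data.table(a = 1:3, ab = c("x", "y", "z"), b = c(1.5, 2.5, 3.5))

# Names of the columns whose class satisfies the predicate
num_cols <- names(mixed)[sapply(mixed, is.numeric)]

# Select just those columns
mixed_num <- mixed[, .SD, .SDcols = num_cols]

# Or apply a function to each selected column
col_means <- mixed[, lapply(.SD, mean), .SDcols = num_cols]
```

Swapping is.numeric for another predicate (is.character, is.factor, and so on) changes the selection without touching the rest of the call.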
Answer 1:
This is a function of lapply, not really data.table. From the lapply documentation:
For historical reasons, the calls created by lapply are unevaluated, and code has been written (e.g. bquote) that relies on this. This means that the recorded call is always of the form FUN(X[[0L]], ...), with 0L replaced by the current integer index. This is not normally a problem, but it can be if FUN uses sys.call or match.call or if it is a primitive function that makes use of the call. This means that it is often safer to call primitive functions with a wrapper, so that e.g. lapply(ll, function(x) is.numeric(x)) is required in R 2.7.1 to ensure that method dispatch for is.numeric occurs correctly.
Update re @Hadley's and @DWin's comments:
EE <- new.env()
EE$var1 <- "I am var1 in EE"
EE$var2 <- "I am var2 in EE"
## Calling get directly
with(EE, lapply(c("var1", "var2"), get))
Error in FUN(c("var1", "var2")[[1L]], ...) : object 'var1' not found
## Calling get via an anonymous function
with(EE, lapply(c("var1", "var2"), function(x) get(x)))
[[1]]
[1] "I am var1 in EE"
[[2]]
[1] "I am var2 in EE"
with(EE, lapply(c("var1", "var2"), rm))
Error in FUN(c("var1", "var2")[[1L]], ...) :
... must contain names or character strings
with(EE, lapply(c("var1", "var2"), function(x) rm(x)))
[[1]]
NULL
[[2]]
NULL
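# No error this time, but note that rm(x) only removes the wrapper's local
# variable x; var1 & var2 are still in EE (use rm(list = x, envir = EE) to remove them)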
EE
<environment: 0x1154d0060>
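One way to read the behaviour above, offered as a sketch rather than a definitive account: by default get looks in the frame it is called from, and when lapply calls it directly that frame is not EE. Passing the environment explicitly sidesteps the issue without a wrapper. The environment `env` below is hypothetical and merely mirrors EE.

```r
# Hypothetical environment mirroring EE from the example above
env <- new.env()
env$var1 <- "I am var1 in env"
env$var2 <- "I am var2 in env"

# Pass the environment explicitly, so no wrapper function is needed
vals <- lapply(c("var1", "var2"), get, envir = env)

# mget() fetches several objects by name in one call (returns a named list)
vals2 <- mget(c("var1", "var2"), envir = env)

# rm() accepts a character vector of names through its `list` argument
rm(list = c("var1", "var2"), envir = env)
remaining <- ls(env)
```

After the rm call, `remaining` is empty, which confirms that the objects were removed from `env` itself and not from some intermediate frame.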
Answer 2:
While @Ricardo is correct that it is safer to wrap primitives, or functions that rely on method dispatch, in a wrapper, here we can avoid this by setting the correct environment for get to search in. The trick with lapply is to use sys.parent(n) (in this case n = 0 will work) to obtain the appropriate calling environment.
test.table[,lapply(grep('a',names(test.table),value=TRUE),
get, envir = sys.parent(0))]
(More information can be found in the question "Using get inside lapply, inside a function".)
Answer 3:
It's only because data.table evaluates the j expression (in simpler terms, everything after the first comma in DT[, ...]) as an actual expression. So DT[, "Column1"] returns "Column1", just as with(DT, "Column1") returns "Column1". It's in the data.table FAQ.
If you want, you can do:
DT[, names(DT), with = FALSE]
Source: https://stackoverflow.com/questions/18064602/why-do-i-need-to-wrap-get-in-a-dummy-function-within-a-j-lapply-call