Question
I'm trying to apply a function to a group of columns in a large data.table without referring to each one individually.
a <- data.table(
a=as.character(rnorm(5)),
b=as.character(rnorm(5)),
c=as.character(rnorm(5)),
d=as.character(rnorm(5))
)
b <- c('a','b','c','d')
With the MWE above, this:
a[,b=as.numeric(b),with=F]
works, but this:
a[,b[2:3]:=data.table(as.numeric(b[2:3])),with=F]
doesn't work. What is the correct way to apply the as.numeric function to just columns 2 and 3 of a, without referring to them individually?
(In the actual data set there are tens of columns, so naming each one would be impractical.)
Answer 1:
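The idiomatic approach is to use .SD and .SDcols.
You can force the LHS of := to be evaluated in the parent frame by wrapping it in (), so that it is read as a vector of column names or positions rather than as the name of a new column: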
a[, (b) := lapply(.SD, as.numeric), .SDcols = b]
For columns 2:3:
a[, 2:3 := lapply(.SD, as.numeric), .SDcols = 2:3]
or
mysubset <- 2:3
a[, (mysubset) := lapply(.SD, as.numeric), .SDcols = mysubset]
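To check the result, here is a minimal self-contained sketch of the same idea run from start to finish. The names dt and cols and the seed are only illustrative and are not from the question; selecting the columns with names(dt)[2:3] is one of several equivalent ways to do it.

```r
library(data.table)

set.seed(1)  # arbitrary seed, only so the example is reproducible

# Same shape as the MWE: four character columns holding numbers as text.
dt <- data.table(
  a = as.character(rnorm(5)),
  b = as.character(rnorm(5)),
  c = as.character(rnorm(5)),
  d = as.character(rnorm(5))
)

# Names of the columns to convert (positions 2 and 3, i.e. "b" and "c").
cols <- names(dt)[2:3]

# The parentheses make data.table read `cols` as a vector of column names
# rather than create a single new column literally called "cols".
# .SDcols restricts .SD to those columns, and lapply converts each one.
dt[, (cols) := lapply(.SD, as.numeric), .SDcols = cols]

# Inspect the class of every column after the in-place update.
sapply(dt, class)
```

After this runs, columns b and c should be numeric while a and d remain character. Because := modifies the table by reference, there is no need to assign the result back to dt.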
Source: https://stackoverflow.com/questions/16783598/apply-a-function-to-a-subset-of-data-table-columns-by-column-indices-instead-of