I have data with both numeric and non-numeric columns like this:
mydt
vnum1 vint1 vfac1 vch1
1: -0.30159484 8 3
Searching SO for .SDcols, I landed on this answer, which I think explains quite nicely how to use it.
cols = sapply(mydt, is.numeric)
cols = names(cols)[cols]
mydt[, lapply(.SD, mean), .SDcols = cols]
# vnum1 vint1
# 1: -0.046491 4.5
Doing mydt[, sapply(mydt, is.numeric), with = FALSE] (note: the "modern" way to do that is mydt[ , .SD, .SDcols = is.numeric]) is not that efficient, because it subsets your data.table to those columns, which makes a (deep) copy and uses more memory than necessary.
And using colMeans coerces the data.table into a matrix, which again is not so memory efficient.
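For anyone who wants to try this end to end, here is a minimal self-contained sketch. The sample values are made up for illustration (they are not the data from the question), and passing a function to .SDcols assumes a reasonably recent data.table version.

```r
library(data.table)

# Hypothetical sample data: a double, an integer, a factor and a character column
mydt <- data.table(
  vnum1 = c(1.5, 2.5, 3.5, 4.5),
  vint1 = 1:4,
  vfac1 = factor(c("a", "b", "a", "b")),
  vch1  = c("w", "x", "y", "z")
)

# Names of the numeric columns (integer columns count as numeric, factors do not)
cols <- names(mydt)[sapply(mydt, is.numeric)]

# Column means restricted to the numeric columns via .SDcols
res <- mydt[, lapply(.SD, mean), .SDcols = cols]
# vnum1 = 3, vint1 = 2.5

# Same selection with a function passed to .SDcols
# (assumes a data.table version that accepts functions here)
res2 <- mydt[, lapply(.SD, mean), .SDcols = is.numeric]

# For comparison only: subset first, then colMeans (goes through a matrix)
res3 <- colMeans(mydt[, cols, with = FALSE])
```

All three give the same means; the difference is only in how much gets copied or converted along the way.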