Question
I have a data.table that resembles the one below.
library(data.table)
tab <- data.table(a = c(NA, 42190, NA), b = c(42190, 42190, NA), c = c(40570, 42190, NA))
tab
       a     b     c
1:    NA 42190 40570
2: 42190 42190 42190
3:    NA    NA    NA
Given a vector of row indices and a vector of column indices of the same length, I would like to get back a vector containing the elements of tab at the corresponding (row, column) pairs.
For example, suppose I wanted to get the diagonal elements of tab. I would specify two vectors,
ri <- 1:3
ci <- 1:3
and some function, function(ri, ci, tab), would return the diagonal elements of tab.
If tab were a data.frame, I would do what's below,
as.data.frame(tab)[cbind(ri, ci)]
but I would like to avoid data.frame syntax. I would also like to avoid a for loop, as this tends to be slow.
Answer 1:
There is a way to do this without explicitly coercing tab to a matrix or a data.frame yourself. Just call the [.data.frame method directly.
`[.data.frame`( tab, cbind(ri,ci) )
[1] NA 42190 NA
This is the functional (prefix) form of a call to the [.data.frame method, so tab is indexed as if it were a plain data.frame.
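The question asked for something of the form function(ri, ci, tab). Below is a minimal sketch of such a wrapper, assuming ri and ci are index vectors of equal length. The name cells_at is hypothetical and not part of data.table. As far as I can tell, base R's [.data.frame handles a matrix index by calling as.matrix() on its argument, so a table with mixed column types would be coerced to a single common type before the lookup.

```r
library(data.table)

# Hypothetical helper (not part of data.table):
# returns the element at (ri[k], ci[k]) for each k.
cells_at <- function(ri, ci, tab) {
  stopifnot(length(ri) == length(ci))
  # A two-column index matrix selects one (row, column) pair per matrix row.
  `[.data.frame`(tab, cbind(ri, ci))
}

tab <- data.table(a = c(NA, 42190, NA),
                  b = c(42190, 42190, NA),
                  c = c(40570, 42190, NA))

diag_vals <- cells_at(1:3, 1:3, tab)          # diagonal elements
off_vals  <- cells_at(c(1, 2), c(3, 1), tab)  # elements (1, 3) and (2, 1)
```

The index vectors do not have to describe a diagonal; any list of (row, column) pairs works the same way.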
Answer 2:
(UPDATE: @42-'s answer using [.data.frame is best. But here's my previous answer)
as.matrix(tab)[cbind(ri, ci)]
is going to be faster and more memory-efficient than melt.
I see no reason not to store your data as a matrix in the first place, as @thelatemail recommends. This is one case where data.table syntax is not as powerful as matrix indexing.
(For memory efficiency with large DTs, data.table has the commands setDF/setDT to convert to/from data.frame/data.table without copying, but I'm not aware of an equivalent for matrix. If that is something people do a lot, it might make a good enhancement request for data.table.
For really big dimensions, you might look into the sparse-matrix formats in the Matrix package, or chunk your data, or use disk-backed data structures.)
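To illustrate the matrix-first suggestion, here is a small base R sketch. It assumes all columns are numeric, so nothing is lost by holding them in a matrix; the names m, idx and diag_m are illustrative only.

```r
# Same values as tab, stored as a plain numeric matrix instead of a data.table.
m <- cbind(a = c(NA, 42190, NA),
           b = c(42190, 42190, NA),
           c = c(40570, 42190, NA))

ri <- 1:3
ci <- 1:3

idx <- cbind(ri, ci)  # 3 x 2 index matrix: one (row, column) pair per row
diag_m <- m[idx]      # picks m[1, 1], m[2, 2], m[3, 3]
```

If the columns had different types, the matrix would hold them all as one common type (for example character), which is worth checking before switching storage.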
Source: https://stackoverflow.com/questions/50635084/taking-a-data-table-slice-with-a-sequence-of-row-col-indices