问题
I'm having trouble efficiently loading data into a sparse matrix format in R.
Here is an (incomplete) example of my current strategy:
library(Matrix)
a1=Matrix(0,5000,100000,sparse=T)
for(i in 1:5000)
a1[i,idxOfCols]=x
Where x is usually around length 20. This is not efficient and eventually slows to a crawl. I know there is a better way but wasn't sure how. Suggestions?
回答1:
You can populate the matrix all at once:
library(Matrix)
n <- 5000
m <- 1e5
k <- 20
idxOfCols <- sample(1:m, k)
x <- rnorm(k)
a2 <- sparseMatrix(
i=rep(1:n, each=k),
j=rep(idxOfCols, n),
x=rep(x, k),
dims=c(n,m)
)
# Compare
a1 <- Matrix(0,5000,100000,sparse=T)
for(i in 1:n) {
a1[i,idxOfCols] <- x
}
sum(a1 - a2) # 0
回答2:
You don't need to use a for-loop. Yu can just use standard matrix indexing with a two column matrix:
a1[ cbind(i,idxOfCols) ] <- x
来源:https://stackoverflow.com/questions/9650851/efficiently-load-a-sparse-matrix-in-r