Question
I have 100,000 sparse matrices ("dgCMatrix") stored in a list object. Every matrix has the same number of rows (8,000,000), and the list is approximately 25 GB in size. Now when I do:
do.call(cbind, theListofMatrices)
to combine all the matrices into one big sparse matrix, I get "node stack overflow". In fact, I can't even do this with only 500 elements of that list, which should produce a sparse matrix of only about 100 MB.
My guess is that cbind() converts the sparse matrices to ordinary dense matrices and thereby causes the stack overflow?
Actually, I have tried something like this:
tmp = do.call(cbind, theListofMatrices[1:400])
This works fine, and tmp is still a sparse matrix with a size of 95 MB. Then I tried:
> tmp = do.call(cbind, theListofMatrices[1:410])
Error in stopifnot(0 <= deparse.level, deparse.level <= 2) :
node stack overflow
and then the error occurred. However, I am having no trouble doing something like:
cbind(tmp, tmp, tmp, tmp)
So I believe it has something to do with do.call().
Reduce() seems to solve my problem, though I still don't know why do.call() crashes.
Answer 1:
The problem is not in do.call() but in the way cbind from the Matrix package is implemented. It uses recursion to bind the individual arguments together. For instance, Matrix::cbind(mat1, mat2, mat3) is translated to something along the lines of Matrix::cbind(mat1, Matrix::cbind(mat2, mat3)).
Since do.call(cbind, theListofMatrices) is basically cbind(theListofMatrices[[1]], theListofMatrices[[2]], ...), the cbind function receives one argument per list element. With too many arguments, the recursion is nested too deeply and fails.
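To make the nesting concrete, here is a toy model in base R. It is not the actual Matrix or methods code: bind_recursive, its depth argument, and max_depth are hypothetical names invented for this illustration. The sketch only shows that binding "first argument with everything else" needs one nested call per argument.

```r
# Toy model only, NOT the real cbind implementation.
# max_depth records the deepest nesting level that was reached.
max_depth <- 0

bind_recursive <- function(..., depth = 1) {
  max_depth <<- max(max_depth, depth)
  args <- list(...)
  if (length(args) == 1) return(args[[1]])
  # Bind the first argument to the result of binding all remaining ones.
  rest <- do.call(bind_recursive, c(args[-1], list(depth = depth + 1)))
  cbind(args[[1]], rest)
}

# 20 one-column matrices stand in for the list of sparse matrices.
cols <- lapply(1:20, function(i) matrix(i, nrow = 2, ncol = 1))
out <- do.call(bind_recursive, cols)

dim(out)    # 2 rows, 20 columns
max_depth   # 20: one nested call per argument
```

With hundreds of arguments, a scheme like this stacks up hundreds of nested calls, which is consistent with the failure appearing somewhere between 400 and 410 list elements in the question.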
Thus, Ben's comment suggesting Reduce() is a good way to work around the issue, since it avoids the deep recursion and replaces it with an iteration:
tmp <- Reduce(cbind, theListofMatrices[-1], theListofMatrices[[1]])
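A small reproducible sketch of this workaround follows, using random sparse matrices as a stand-in for theListofMatrices. The second part is an assumption that goes beyond the answer: cbind_chunked is a hypothetical helper that binds the list in chunks small enough for do.call(), which may avoid re-copying the growing result at every step the way Reduce() does. The chunk size is an arbitrary choice for this sketch.

```r
library(Matrix)

# Toy stand-in for theListofMatrices: 50 sparse matrices with 1,000 rows each.
set.seed(1)
mats <- lapply(1:50, function(i) rsparsematrix(nrow = 1000, ncol = 3, density = 0.01))

# Iterative binding: every step is a two-argument cbind, so nesting stays shallow.
combined <- Reduce(cbind, mats[-1], mats[[1]])
dim(combined)  # 1000 rows, 150 columns

# Hypothetical helper (not from the answer): bind in chunks so that each
# do.call() only sees chunk_size arguments, then repeat until one matrix remains.
cbind_chunked <- function(mats, chunk_size = 100) {
  while (length(mats) > 1) {
    groups <- split(mats, ceiling(seq_along(mats) / chunk_size))
    mats <- unname(lapply(groups, function(g) do.call(cbind, unname(g))))
  }
  mats[[1]]
}

chunked <- cbind_chunked(mats, chunk_size = 8)
dim(chunked)   # same dimensions as combined
```

Both variants keep the result sparse. Which one is faster on the full 100,000-element list would have to be measured.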
Answer 2:
In R, a 2-column matrix can have at most 2^30 - 1 = 1,073,741,823 rows. So I would check the number of rows and the available RAM to make sure the machine can accommodate the combined matrix.
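A minimal sketch of such a pre-check for sparse inputs is shown below. It assumes the matrices are "dgCMatrix" objects, which store their nonzero values in the x slot and use integer index vectors, so the total number of nonzeros has to stay within .Machine$integer.max. The variable names are made up for this example, and the toy list stands in for the real one.

```r
library(Matrix)

# Toy stand-in for the real list of sparse matrices.
set.seed(1)
mats <- lapply(1:5, function(i) rsparsematrix(nrow = 1000, ncol = 3, density = 0.01))

# All inputs must have the same number of rows for cbind to work.
same_rows <- length(unique(vapply(mats, nrow, numeric(1)))) == 1

# Sum as doubles to avoid integer overflow on very large lists.
total_cols <- sum(vapply(mats, ncol, numeric(1)))
total_nnz  <- sum(vapply(mats, function(m) length(m@x), numeric(1)))

# The combined matrix has to keep its nonzero count within the integer range.
fits_index <- total_nnz <= .Machine$integer.max

# Rough size of the combined dgCMatrix in bytes: 8 per value, 4 per row index,
# plus 4 per column pointer.
approx_bytes <- 12 * total_nnz + 4 * (total_cols + 1)
```

If fits_index is FALSE or approx_bytes exceeds the available RAM, binding everything into a single matrix is unlikely to succeed regardless of how the binding is done.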
Source: https://stackoverflow.com/questions/37581417/getting-node-stack-overflow-when-cbind-multiple-sparse-matrices