Getting “node stack overflow” when cbind multiple sparse matrices

Submitted by 瘦欲@ on 2019-12-11 02:25:07

Question


I have 100,000 sparse matrices ("dgCMatrix") stored in a list object. Every matrix has the same number of rows (8,000,000), and the size of the list is approximately 25 Gb. Now when I do:

do.call(cbind, theListofMatrices)

to combine all the matrices into one big sparse matrix, I get "node stack overflow". In fact, I can't even do this with only 500 elements of that list, which should produce a sparse matrix with a size of only 100 Mb.

My speculation is that the cbind() function converts the sparse matrices to ordinary dense matrices and thus causes the stack overflow?

Actually, I have tried something like this:

tmp = do.call(cbind, theListofMatrices[1:400])

This works fine, and tmp is still a sparse matrix with a size of 95 Mb. Then I tried:

> tmp = do.call(cbind, theListofMatrices[1:410])
Error in stopifnot(0 <= deparse.level, deparse.level <= 2) : 
  node stack overflow

and that is where the error occurred. However, I have no trouble doing something like:

cbind(tmp, tmp, tmp, tmp)

So I believe it has something to do with do.call().

Reduce() seems to solve my problem, though I still don't know why do.call() crashes.


Answer 1:


The problem is not do.call() itself but the way cbind from the Matrix package is implemented. It uses recursion to bind the individual arguments together. For instance, Matrix::cbind(mat1, mat2, mat3) is translated into something along the lines of Matrix::cbind(mat1, Matrix::cbind(mat2, mat3)). Since do.call(cbind, theListofMatrices) is basically cbind(theListofMatrices[[1]], theListofMatrices[[2]], ...), cbind receives one argument per matrix. The recursion nests one level deeper for each argument, so with enough arguments it becomes too deep and fails with "node stack overflow".
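The nesting described above can be reproduced with ordinary base R matrices. This is only a minimal sketch: bind_recursive and bind_iterative are hypothetical helpers written for illustration, not the Matrix package's code. The depth counter shows that the recursive variant nests one call per argument, while the Reduce()-based variant makes one flat cbind() call at a time.

```r
# Hypothetical helper (NOT the Matrix package's implementation): binds the
# first argument to the result of calling itself on the remaining arguments,
# and reports how deeply the calls were nested.
bind_recursive <- function(..., depth = 1L) {
  args <- list(...)
  if (length(args) == 1L) {
    return(list(result = args[[1L]], depth = depth))
  }
  rest <- do.call(bind_recursive, c(args[-1L], list(depth = depth + 1L)))
  list(result = cbind(args[[1L]], rest$result), depth = rest$depth)
}

# Hypothetical iterative counterpart: binds one matrix at a time, so the
# nesting does not grow with the number of arguments.
bind_iterative <- function(...) {
  args <- list(...)
  Reduce(cbind, args[-1L], args[[1L]])
}

# Five small dense matrices stand in for the list of sparse matrices.
mats <- replicate(5, matrix(1, nrow = 2, ncol = 1), simplify = FALSE)
rec <- do.call(bind_recursive, mats)
itr <- do.call(bind_iterative, mats)

rec$depth               # nesting depth reached by the recursive variant
all(rec$result == itr)  # both variants build the same matrix
```

With a few hundred arguments instead of five, nesting of this kind is what can exhaust R's evaluation stack.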

Thus, Ben's comment suggesting Reduce() is a good way to work around the issue, since it replaces the recursion with an iteration that binds one matrix at a time:

tmp <- Reduce(cbind, theListofMatrices[-1], theListofMatrices[[1]])
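The Reduce() call above can be tried on a small scale, as sketched below. The matrices are random stand-ins generated with Matrix::rsparsematrix, and cbind_in_chunks is a hypothetical helper, not something from the original answer. It relies on the question's observation that do.call(cbind, ...) works for a few hundred matrices: it binds the list in batches and then binds the batches, which may reduce how often the growing result is copied compared with binding one matrix at a time. The chunk size of 100 is an assumption chosen to stay well below the point where the question reports the failure.

```r
library(Matrix)

set.seed(1)
# Small random sparse matrices standing in for the real list.
mats <- lapply(1:50, function(i) rsparsematrix(20, 3, density = 0.1))

# Iterative binding, exactly as in the answer.
combined <- Reduce(cbind, mats[-1], mats[[1]])

# Hypothetical helper: bind batches of at most chunk_size matrices with
# do.call(), then bind the batch results iteratively.
cbind_in_chunks <- function(mats, chunk_size = 100L) {
  groups <- split(mats, ceiling(seq_along(mats) / chunk_size))
  partial <- lapply(groups, function(g) do.call(cbind, unname(g)))
  Reduce(cbind, partial[-1], partial[[1]])
}

chunked <- cbind_in_chunks(mats, chunk_size = 10L)

is(combined, "sparseMatrix")  # the result stays sparse
dim(combined)
```

Both approaches should produce the same matrix; which one is faster on the full 100,000-element list would need to be measured.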



Answer 2:


In R, a 2-column matrix can have up to 2^30 - 1 = 1,073,741,823 rows. So I would check the number of rows and the amount of RAM to make sure the machine can accommodate a matrix that big.
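The figure above can be checked with a little arithmetic, assuming the limit the answer has in mind is R's largest integer, 2^31 - 1, taken as the maximum number of elements. The list used for the dimension check is a hypothetical stand-in built from tiny base matrices; on the real data the same two vapply() calls would report the shared row count and the total number of columns.

```r
# Assumption: the limit in question is .Machine$integer.max elements.
int_max <- .Machine$integer.max
rows_limit <- int_max %/% 2L   # rows available when there are 2 columns
rows_limit == 2^30 - 1

# Hypothetical stand-in for the list of matrices.
mats <- replicate(3, matrix(0, nrow = 4, ncol = 2), simplify = FALSE)
n_rows <- unique(vapply(mats, nrow, integer(1)))  # should be a single value
n_cols <- sum(vapply(mats, ncol, integer(1)))     # columns after cbind
```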



Source: https://stackoverflow.com/questions/37581417/getting-node-stack-overflow-when-cbind-multiple-sparse-matrices
