Question
How can I merge two large (around 500k rows and columns) sparse matrices of class dgCMatrix that differ in size (in both rows and columns) in R?
Simplified example: I have a full 6x6 matrix:
  1 2 3 4 5 6
1 0 0 0 0 0 0
2 0 0 0 0 0 0
3 0 0 0 0 0 0
4 0 0 0 0 0 0
5 0 0 0 0 0 0
6 0 0 0 0 0 0
Now I want to merge a second matrix of different size:
  3 4 5 6
1 0 1 0 0
3 0 0 1 0
4 1 0 0 0
The result should be:
  1 2 3 4 5 6
1 0 0 0 1 0 0
2 0 0 0 0 0 0
3 0 0 0 0 1 0
4 0 0 1 0 0 0
5 0 0 0 0 0 0
6 0 0 0 0 0 0
I tried cbindX and merge, but neither worked. They fail with either:
only matrices and data.frames can be used
or
cannot coerce class "structure("dgCMatrix", package = "Matrix")" to a data.frame.
I also cannot convert my matrix to a dense matrix (sparse = FALSE), as suggested in another post, or to a data.frame, because R can then no longer handle the matrix size on my machine.
Any help would be highly appreciated. Thanks!
Answer 1:
One strategy is to bring both matrices to the same dimensions and then add them.
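library(Matrix)
A <- sparseMatrix(i = 8, j = 8, x = 1)  # 8 x 8, single non-zero entry at (8, 8)
B <- sparseMatrix(i = c(1, 3, 5), j = c(3, 6, 3), x = c(1, 4, 1))  # 5 x 6, dims inferred from the largest indices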
You can get the non-zero entries of B as (i, j, x) triplets with summary(B), and then rebuild the matrix with sparseMatrix(i, j, x, dims), passing the dimensions of A:
> summary(B)
5 x 6 sparse Matrix of class "dgCMatrix", with 3 entries
  i j x
1 1 3 1
2 5 3 1
3 3 6 4
s <- summary(B)  # compute the triplets once instead of three times
B <- sparseMatrix(i = s$i, j = s$j, x = s$x, dims = dim(A))
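As a quick sanity check of the rebuild step, the following minimal sketch pads the 5 x 6 example matrix to 8 x 8 and confirms that no entries were gained, lost, or moved. The names trip and B_padded are illustrative and not part of the original answer.

```r
library(Matrix)

# The 5 x 6 example matrix from above
B <- sparseMatrix(i = c(1, 3, 5), j = c(3, 6, 3), x = c(1, 4, 1))

# Non-zero entries as (i, j, x) triplets
trip <- summary(B)

# Same entries, but with the target dimensions (8 x 8, as for A above)
B_padded <- sparseMatrix(i = trip$i, j = trip$j, x = trip$x, dims = c(8, 8))

dim(B_padded)                   # dimensions after padding
nnzero(B_padded) == nnzero(B)   # same number of non-zero entries
all(B_padded[1:5, 1:6] == B)    # top-left block equals the original B
```

Only the non-zero triplets are touched, so the matrix never has to be converted to a dense representation.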
Then you can just add the matrices:
A <- A + B
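The approach above aligns the two matrices by position. In the question, the smaller matrix is labelled by row and column names, so its indices first have to be translated into positions in the larger matrix. The sketch below does this with match() on the dimnames. It assumes both matrices carry dimnames and that every label of the smaller matrix also occurs in the larger one. The helper merge_by_dimnames is hypothetical and not part of the Matrix package or the original answer.

```r
library(Matrix)

# Hypothetical helper: place the entries of `small` into a matrix shaped
# like `big`, matching rows and columns by name, then add the two.
merge_by_dimnames <- function(big, small) {
  trip <- summary(small)  # (i, j, x) triplets of the non-zero entries
  i <- match(rownames(small)[trip$i], rownames(big))
  j <- match(colnames(small)[trip$j], colnames(big))
  stopifnot(!anyNA(i), !anyNA(j))  # assumption: all labels exist in `big`
  big + sparseMatrix(i = i, j = j, x = trip$x,
                     dims = dim(big), dimnames = dimnames(big))
}

# The 6 x 6 all-zero matrix from the question
labels <- as.character(1:6)
A <- sparseMatrix(i = integer(0), j = integer(0), x = numeric(0),
                  dims = c(6, 6), dimnames = list(labels, labels))

# The 3 x 4 matrix from the question: rows "1", "3", "4"; columns "3" to "6"
B <- sparseMatrix(i = c(1, 2, 3), j = c(2, 3, 1), x = c(1, 1, 1),
                  dims = c(3, 4),
                  dimnames = list(c("1", "3", "4"), c("3", "4", "5", "6")))

merged <- merge_by_dimnames(A, B)
merged
```

Like the positional version, this only works on the non-zero triplets, so it should scale with the number of stored entries rather than with the full 500k x 500k shape.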
Source: https://stackoverflow.com/questions/31827512/merge-two-dgcmatrix-sparse-matrices-of-different-size-in-r