tile operation to create a csr_matrix from one row of another csr_matrix

Submitted by 狂风中的少年 on 2019-12-11 11:14:23

Question


I have a sparse matrix 'a' of type csr_matrix. I want to create a new csr_matrix 'b' in which every row is the same as the ith row of 'a'.

For ordinary numpy arrays I think this is possible with the 'tile' operation, but I cannot find an equivalent for csr_matrix.

Building a dense numpy matrix first and converting it to csr_matrix is not an option, as the matrix is 10000 x 10000.


Answer 1:


I found an answer that doesn't require creating the full numpy matrix and is fast enough for my purpose. I'm adding it here in case it's useful to others:

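import numpy as np
import scipy.sparse
rows, cols = a.shape
row = a[2]  # row to tile: index 2, i.e. the third row of 'a'
b = scipy.sparse.csr_matrix((np.tile(row.data, rows), np.tile(row.indices, rows),
                           row.nnz * np.arange(rows + 1)), shape=a.shape)  # indptr steps by row.nnz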

This takes the row of 'a' at index 2 (the third row, since indexing is zero-based) and tiles it to create 'b'.

The timing test below shows it is quite fast for a 10000 x 10000 matrix:

100 loops, best of 3: 2.24 ms per loop
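
To make the role of the three CSR arrays clearer, here is a minimal sketch that wraps the same idea in a function and checks it against a dense np.tile on a small matrix. The name tile_row and its parameters are hypothetical, not part of SciPy, and the sketch assumes 'a' is a csr_matrix in canonical form. It reads the row directly from a.indptr rather than slicing, and it also handles a row with no stored entries.

```python
import numpy as np
import scipy.sparse as sp


def tile_row(a, i, n_rows):
    """Return a CSR matrix with n_rows rows, each equal to row i of CSR matrix a.

    tile_row is a hypothetical helper name used for illustration only.
    """
    # Stored entries of row i live in data/indices[indptr[i]:indptr[i + 1]].
    start, stop = a.indptr[i], a.indptr[i + 1]
    data = a.data[start:stop]
    indices = a.indices[start:stop]
    nnz = int(stop - start)
    # Each output row holds nnz entries, so indptr is 0, nnz, 2*nnz, ...
    indptr = nnz * np.arange(n_rows + 1)
    return sp.csr_matrix(
        (np.tile(data, n_rows), np.tile(indices, n_rows), indptr),
        shape=(n_rows, a.shape[1]),
    )


a = sp.csr_matrix(np.array([[1, 0, 2, 0],
                            [0, 0, 0, 0],
                            [0, 3, 0, 4]]))

b = tile_row(a, 2, 3)
# Compare with the dense equivalent on this small example.
assert (b.toarray() == np.tile(a.toarray()[2], (3, 1))).all()

# A row without stored entries yields an all-zero result.
empty = tile_row(a, 1, 3)
```

Only the tiled row's stored entries are ever copied, so memory use scales with n_rows times the number of nonzeros in that one row, not with the dense size.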



Answer 2:


There is a block-stacking function, sparse.bmat, that lets you create a new sparse matrix from a grid of other matrices; sparse.vstack covers the common case of stacking a list of matrices vertically.

So for a start you could

 from scipy import sparse
 a1 = a[i,:]
 ll = [a1,a1,a1,a1]
 b = sparse.vstack(ll, format='csr')   # equivalent to sparse.bmat([[a1],[a1],[a1],[a1]])

I don't have a shell running to test this.

In the general case, bmat turns all input blocks into coo format and collects their coo attributes into 3 large lists (or arrays). In your case of tiled rows, the data and col (j) values would just repeat. The row (i) values would step.
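
The repetition pattern described above can be written out by hand. The following is a rough sketch on a small example matrix (the matrix values and the variable names are illustrative assumptions): data and col are tiled, while row steps through 0, 1, 2, ... with each value repeated once per stored entry.

```python
import numpy as np
import scipy.sparse as sp

a = sp.csr_matrix(np.array([[1, 0, 2, 0],
                            [0, 3, 0, 4]]))
n = 3                                  # number of copies of the row

r = a[1, :].tocoo()                    # the row to tile, in coo form
data = np.tile(r.data, n)              # values repeat unchanged
col = np.tile(r.col, n)                # column indices repeat unchanged
row = np.repeat(np.arange(n), r.nnz)   # row index steps once per copy

b = sp.coo_matrix((data, (row, col)), shape=(n, a.shape[1])).tocsr()
```

Building in coo and converting with tocsr() costs one extra conversion compared with writing the csr arrays directly, but the three arrays are easier to reason about.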

Another approach would be to construct a small test matrix and look at its attributes. What kinds of repetition do you see? It's easy to see patterns in the coo format. lil might also be easy to replicate, maybe with the list * n operation. csr is trickier to understand.
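
As one way to carry out that inspection, here is a small sketch that tiles a row of a toy matrix with sparse.vstack and prints the attributes of the result in coo, csr and lil form. The matrix and variable names are assumptions chosen for illustration.

```python
import numpy as np
import scipy.sparse as sp

a = sp.csr_matrix(np.array([[0, 5, 0, 7],
                            [1, 0, 0, 0]]))
a1 = a[0, :]                           # the row to repeat
b = sp.vstack([a1] * 3, format="csr")  # three stacked copies

c = b.tocoo()
print("coo row   :", c.row)
print("coo col   :", c.col)
print("coo data  :", c.data)

print("csr indptr :", b.indptr)
print("csr indices:", b.indices)
print("csr data   :", b.data)

l = b.tolil()
print("lil rows  :", list(l.rows))     # one list of column indices per row
print("lil data  :", list(l.data))     # one list of values per row
```

In the csr output, indices and data repeat just as in coo, and the only new piece is indptr, which increases by the row's number of stored entries at each step.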



Source: https://stackoverflow.com/questions/36613458/tile-operation-to-create-a-csr-matrix-from-one-row-of-another-csr-matrix
