creating sparse matrix of unknown size

Submitted by 為{幸葍}努か on 2019-12-11 07:28:04

Question


I have a text file with each line indicating an edge on a graph, for example

2 5 1

indicates an edge of weight 1 between nodes 2 and 5. I want to create a sparse adjacency matrix using these tuples. Typically, I'd initialize a sparse matrix as

G = scipy.sparse.lil_matrix((n,n))

where n is the number of nodes in the graph. But in this case, I do not know what n is. Is there a more efficient way to create the matrix than looping over the lines of the file to find the max node index, creating the lil_matrix, and then looping over the file again? My current implementation is this:

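import scipy.sparse as sp

n = 0
with open(gfile) as f:
    for line in f:
        # list() is needed in Python 3, where map() returns an iterator
        temp = list(map(int, line.split()))
        n = max(n, temp[0], temp[1])
n += 1  # indices are 0-based, so the size is the largest index plus one
G = sp.lil_matrix((n, n))
with open(gfile) as f:
    for line in f:
        temp = list(map(int, line.split()))
        G[temp[0], temp[1]] = temp[2]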

Answer 1:


The original, and still prototypical, way of creating a sparse matrix is to collect all inputs in row, col, data arrays (or lists), and use coo_matrix to construct the matrix. Shape can be deduced from those inputs (maximum index values), or given as a parameter.

To adapt your code

from scipy import sparse

row, col, data = [], [], []
with open(gfile) as f:
    for line in f:
        # list() is needed in Python 3, where map() returns an iterator
        temp = list(map(int, line.split()))
        # G[temp[0],temp[1]] = temp[2]
        data.append(temp[2])
        row.append(temp[0])
        col.append(temp[1])
G = sparse.coo_matrix((data, (row, col)))

List appends are at least as fast as line reads, and faster than inserting into a sparse matrix, even a lil_matrix (lil assignment involves list appends as well).
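
As a rough illustration of the two ways of fixing the shape, here is a minimal sketch using a small in-memory edge list instead of a file. The `edges` data and the variable names are made up for the example. Note that the inferred shape comes from the largest row index and the largest column index separately, so it is not necessarily square; for an adjacency matrix it is probably safer to pass `shape` explicitly.

```python
from scipy import sparse

# Hypothetical edge list of (row, col, weight) triples with 0-based node indices.
edges = [(2, 5, 1), (0, 3, 4), (1, 4, 2)]
row, col, data = zip(*edges)

# Shape inferred from the inputs: (max(row) + 1, max(col) + 1).
G_inferred = sparse.coo_matrix((data, (row, col)))

# Shape given as a parameter, forcing a square adjacency matrix.
n = max(max(row), max(col)) + 1
G_square = sparse.coo_matrix((data, (row, col)), shape=(n, n))

# COO is a construction format; convert to CSR for indexing and arithmetic.
G_csr = G_square.tocsr()

print(G_inferred.shape, G_square.shape)
```

Here the largest row index is 2 and the largest column index is 5, so the inferred matrix is 3 by 6 while the explicit one is 6 by 6.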

I suspect you could also do:

import numpy as np
from scipy import sparse

A = np.genfromtxt(gfile, dtype=int)  # default delimiter is whitespace
# A should now be a 2-D array with 3 columns
G = sparse.coo_matrix((A[:, 2], (A[:, 0], A[:, 1])))

That is, read the whole file with genfromtxt or loadtxt and create the sparse matrix from the resulting columns.
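
A sketch of the loadtxt variant, assuming 0-based node indices and using an `io.StringIO` object as a stand-in for the real file. The sample lines are invented for the example. Passing `ndmin=2` should keep the array 2-D even when the file contains a single edge, which otherwise would break the column slicing.

```python
import io

import numpy as np
from scipy import sparse

# Stand-in for the edge file; each line is "row col weight".
text = io.StringIO("2 5 1\n0 3 4\n5 1 2\n")

# ndmin=2 guarantees a 2-D array, even for a one-line file.
A = np.loadtxt(text, dtype=int, ndmin=2)

# Largest node index over both index columns, plus one, gives a square shape.
n = A[:, :2].max() + 1
G = sparse.coo_matrix((A[:, 2], (A[:, 0], A[:, 1])), shape=(n, n)).tocsr()

print(G.shape, G.nnz)
```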

(When I made sparse matrices in MATLAB years ago, I used this sort of data, col, row initialization, though with a clever use of indexing to assemble those arrays from finite element blocks without loops.)



Source: https://stackoverflow.com/questions/38820079/creating-sparse-matrix-of-unknown-size
