Question
I have a text file with each line indicating an edge on a graph, for example
2 5 1
indicates an edge of weight 1 between nodes 2 and 5. I want to create a sparse adjacency matrix using these tuples. Typically, I'd initialize a sparse matrix as
G = scipy.sparse.lil_matrix((n,n))
where n is the number of nodes in the graph. But in this case, I do not know what n is. Is there a more efficient way to create the matrix than looping over the lines of the file to find the maximum node index, creating the lil_matrix, and then looping over the file again? My current implementation is this:
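import numpy as np
import scipy.sparse as sp

n = 0
with open(gfile) as f:
    for line in f:
        temp = list(map(int, line.split()))  # list() so indexing works on Python 3
        n = np.max([n, temp[0], temp[1]])
G = sp.lil_matrix((n + 1, n + 1))  # indices run up to n, so the shape needs n + 1
with open(gfile) as f:
    for line in f:
        temp = list(map(int, line.split()))
        G[temp[0], temp[1]] = temp[2]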
Answer 1:
The original, and still prototypical, way of creating a sparse matrix is to collect all inputs in row, col, data arrays (or lists), and use coo_matrix to construct the matrix. Shape can be deduced from those inputs (maximum index values), or given as a parameter.
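As a minimal sketch of the shape behaviour just described (the edge values are made up for illustration, and 0-based node indices are assumed), coo_matrix sizes each dimension as the largest index plus one unless a shape is passed explicitly:

```python
from scipy import sparse

# Hypothetical edge list: parallel row, col, weight lists with 0-based indices.
row = [2, 0, 4]
col = [5, 1, 2]
data = [1, 3, 7]

# Shape inferred from the inputs: (max(row) + 1, max(col) + 1).
G_inferred = sparse.coo_matrix((data, (row, col)))

# For a square adjacency matrix, compute n and pass the shape explicitly.
n = max(max(row), max(col)) + 1
G_square = sparse.coo_matrix((data, (row, col)), shape=(n, n))
```

The inferred matrix need not be square, since rows and columns are sized independently, so passing shape=(n, n) is probably the safer choice for an adjacency matrix.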
To adapt your code:
from scipy import sparse

row, col, data = [], [], []
with open(gfile) as f:
    for line in f:
        temp = list(map(int, line.split()))  # list() so indexing works on Python 3
        # G[temp[0], temp[1]] = temp[2]
        data.append(temp[2])
        row.append(temp[0])
        col.append(temp[1])
G = sparse.coo_matrix((data, (row, col)))
List appends are at least as fast as line reads, and better than sparse matrix inserts, even lil (lil assignment involves list appends as well).
I suspect you could also do:
import numpy as np
from scipy import sparse

A = np.genfromtxt(gfile, dtype=int)  # default whitespace delimiter
# A should now be a 2-D, 3-column array
G = sparse.coo_matrix((A[:, 2], (A[:, 0], A[:, 1])))
That is, read the whole file with genfromtxt or loadtxt and create the sparse matrix from the resulting columns.
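A self-contained sketch of the loadtxt variant, using an in-memory io.StringIO as a stand-in for the edge file (the file contents and the 0-based indexing are assumptions for illustration, not taken from the question):

```python
import io

import numpy as np
from scipy import sparse

# Stand-in for the edge file; each line is "node node weight".
# A real file path could be passed to loadtxt instead.
text = io.StringIO("2 5 1\n0 1 3\n4 2 7\n")

# ndmin=2 keeps the result 2-D even if the file holds a single edge.
A = np.loadtxt(text, dtype=int, ndmin=2)

# Square shape that covers every node index appearing in either column.
n = A[:, :2].max() + 1
G = sparse.coo_matrix((A[:, 2], (A[:, 0], A[:, 1])), shape=(n, n)).tocsr()
```

Converting to CSR at the end is optional; it is shown here because the COO format does not support element indexing such as G[2, 5].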
(When I made sparse matrices in MATLAB years ago, I used this sort of data, col, row initialization, though with a clever use of indexing to assemble those arrays from finite element blocks without loops.)
Source: https://stackoverflow.com/questions/38820079/creating-sparse-matrix-of-unknown-size