Correlation coefficients for sparse matrix in python?
Does anyone know how to compute a correlation matrix from a very large sparse matrix in python? Basically, I am looking for something like numpy.corrcoef that will work on a scipy sparse matrix. You can compute the correlation coefficients fairly straightforwardly from the covariance matrix like this: import numpy as np from scipy import sparse def sparse_corrcoef(A, B=None): if B is not None: A = sparse.vstack((A, B), format='csr') A = A.astype(np.float64) n = A.shape[1] # Compute the covariance matrix rowsum = A.sum(1) centering = rowsum.dot(rowsum.T.conjugate()) / n C = (A.dot(A.T.conjugate