sparse-matrix

Is there support for sparse matrices in Python?

萝らか妹 submitted on 2019-11-29 06:25:07
Question: Is there support for sparse matrices in Python? Possibly in NumPy or in SciPy?

Answer 1: Yes. SciPy provides scipy.sparse, a "2-D sparse matrix package for numeric data". There are seven available sparse matrix types:

- csc_matrix: Compressed Sparse Column format
- csr_matrix: Compressed Sparse Row format
- bsr_matrix: Block Sparse Row format
- lil_matrix: List of Lists format
- dok_matrix: Dictionary of Keys format
- coo_matrix: COOrdinate format (aka IJV, triplet format)
- dia_matrix: DIAgonal format
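As a minimal sketch of how these formats are typically used together (the matrix contents and variable names below are made up for illustration): COO is convenient for building a matrix from triplets, LIL for incremental assignment, and CSR/CSC for arithmetic.

```python
import numpy as np
from scipy import sparse

# Build a 3x4 matrix from (row, col, value) triplets in COO format.
rows = np.array([0, 0, 1, 2])
cols = np.array([0, 2, 1, 3])
vals = np.array([1.0, 2.0, 3.0, 4.0])
coo = sparse.coo_matrix((vals, (rows, cols)), shape=(3, 4))

# Convert to the compressed formats, which suit arithmetic and slicing.
csr = coo.tocsr()
csc = coo.tocsc()

# lil_matrix supports cheap element-by-element construction.
lil = sparse.lil_matrix((3, 4))
lil[0, 0] = 1.0
lil[2, 3] = 4.0

# Only densify when the matrix is small enough to fit in memory.
dense = csr.toarray()
```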

Efficient way to normalize a Scipy Sparse Matrix

泄露秘密 submitted on 2019-11-29 06:22:38
Question: I'd like to write a function that normalizes the rows of a large sparse matrix (such that they sum to one).

    from pylab import *
    import scipy.sparse as sp

    def normalize(W):
        z = W.sum(0)
        z[z < 1e-6] = 1e-6
        return W / z[None,:]

    w = (rand(10,10)<0.1)*rand(10,10)
    w = sp.csr_matrix(w)
    w = normalize(w)

However, this gives the following exception:

    File "/usr/lib/python2.6/dist-packages/scipy/sparse/base.py", line 325, in __div__
        return self.__truediv__(other)
    File "/usr/lib/python2.6/dist-packages
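One possible way to do row normalization without dividing a sparse matrix by a dense array is to left-multiply by a diagonal matrix of reciprocal row sums. This is a sketch under assumptions, not the answer from the original thread: the function name `normalize_rows` and the `eps` parameter are hypothetical, and note that row sums need `axis=1` (the question's `W.sum(0)` sums over columns).

```python
import numpy as np
import scipy.sparse as sp

def normalize_rows(W, eps=1e-6):
    """Scale each row of a sparse matrix so that it sums to one."""
    W = sp.csr_matrix(W, dtype=float)
    row_sums = np.asarray(W.sum(axis=1)).ravel()
    # Guard against division by zero, mirroring the threshold in the question.
    row_sums[row_sums < eps] = eps
    # Left-multiplying by diag(1 / row_sums) scales row i by 1 / row_sums[i].
    return sp.diags(1.0 / row_sums) @ W

rng = np.random.default_rng(0)
dense = (rng.random((10, 10)) < 0.3) * rng.random((10, 10))
w = normalize_rows(sp.csr_matrix(dense))
```

The result stays sparse, and rows that were entirely zero remain zero rather than producing NaNs.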

Access value, column index, and row_ptr data from scipy CSR sparse matrix

故事扮演 submitted on 2019-11-29 03:34:26
I have a large matrix that I would like to convert to sparse CSR format. When I do:

    import scipy as sp
    Ks = sp.sparse.csr_matrix(A)
    print Ks

where A is dense, I get

    (0, 0) -2116689024.0
    (0, 1) 394620032.0
    (0, 2) -588142656.0
    (0, 12) 1567432448.0
    (0, 14) -36273164.0
    (0, 24) 233332608.0
    (0, 25) 23677192.0
    (0, 26) -315783392.0
    (0, 45) 157961968.0
    (0, 46) 173632816.0
    etc...

I can get vectors of row index, column index, and value using:

    Knz = Ks.nonzero()
    sparserows = Knz[0]
    sparsecols = Knz[1]
    #The Non-Zero Value of K at each (Row,Col)
    vals = np.empty(sparserows.shape).astype(np.float)
    for i in
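For reference, a CSR matrix already exposes the three arrays named in the title as the attributes `data`, `indices` and `indptr`, so no loop is needed. A small sketch with a made-up 3x3 matrix standing in for the asker's `A`:

```python
import numpy as np
from scipy import sparse

# Small stand-in for the dense matrix A from the question.
A = np.array([[1.0, 0.0, 2.0],
              [0.0, 0.0, 3.0],
              [4.0, 5.0, 0.0]])
Ks = sparse.csr_matrix(A)

vals = Ks.data        # stored values, row by row
cols = Ks.indices     # column index of each stored value
row_ptr = Ks.indptr   # row i occupies vals[row_ptr[i]:row_ptr[i + 1]]

# If an explicit row index per stored value is wanted, expand row_ptr.
rows = np.repeat(np.arange(Ks.shape[0]), np.diff(row_ptr))
```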

Apply PCA on very large sparse matrix

此生再无相见时 submitted on 2019-11-29 03:18:20
I am doing a text classification task in R, and I have a document-term matrix of size 22,490 by 120,000 (only 4 million non-zero entries, less than 1% of all entries). Now I want to reduce the dimensionality using PCA (Principal Component Analysis). Unfortunately, R cannot handle a matrix this large, so I stored the sparse matrix in a file in the "Matrix Market Format", hoping to use some other tool to do PCA. Could anyone give me some hints for useful libraries (in any programming language) that could do PCA on a matrix of this scale with ease, or do a longhand PCA by
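One commonly used substitute when the matrix is too large to center is a truncated SVD, which works directly on the sparse matrix. The sketch below uses `scipy.sparse.linalg.svds` on a small random matrix (the sizes and variable names are illustrative assumptions). It is not exact PCA: the columns are not mean-centered, because centering would make the matrix dense.

```python
import numpy as np
from scipy import sparse
from scipy.sparse.linalg import svds

# Small random stand-in for a document-term matrix.
rng = np.random.default_rng(0)
n_docs, n_terms, nnz = 200, 1000, 2000
rows = rng.integers(0, n_docs, size=nnz)
cols = rng.integers(0, n_terms, size=nnz)
vals = rng.random(nnz)
X = sparse.csr_matrix((vals, (rows, cols)), shape=(n_docs, n_terms))

# Compute the k largest singular triplets without densifying X.
k = 5
U, s, Vt = svds(X, k=k)

# Sort components by decreasing singular value.
order = np.argsort(s)[::-1]
U, s, Vt = U[:, order], s[order], Vt[order]

# Low-dimensional representation of each document (n_docs x k).
scores = U * s
```

For a matrix stored in Matrix Market format, `scipy.io.mmread` can load the file as a sparse matrix before this step.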

Markov chain stationary distributions with scipy.sparse?

生来就可爱ヽ(ⅴ<●) submitted on 2019-11-29 02:40:33
I have a Markov chain given as a large sparse scipy matrix A. (I've constructed the matrix in scipy.sparse.dok_matrix format, but converting to other formats or constructing it as a csc_matrix is fine.) I'd like to find any stationary distribution p of this matrix, which is an eigenvector for the eigenvalue 1. All entries in this eigenvector should be positive and add up to 1, in order to represent a probability distribution. This means I want any solution of the system (A-I) p = 0, p.sum()=1 (where I=scipy.sparse.eye(*A.shape) is the identity matrix), but (A-I) will not be of full rank, and
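Because (A-I) is singular, one simple alternative to a direct linear solve is power iteration, which needs only sparse matrix-vector products. The sketch below is an illustration under stated assumptions, not the thread's accepted approach: it assumes A is column-stochastic (matching the convention A p = p in the question) and that the chain is irreducible and aperiodic; the function name and the 3x3 example are made up.

```python
import numpy as np
from scipy import sparse

def stationary_distribution(A, tol=1e-12, max_iter=10_000):
    """Power iteration for p with A @ p = p and p.sum() == 1.

    Assumes A is column-stochastic and the chain is irreducible and
    aperiodic, so that the iteration converges to a unique p.
    """
    n = A.shape[0]
    p = np.full(n, 1.0 / n)          # start from the uniform distribution
    for _ in range(max_iter):
        p_next = A @ p
        p_next /= p_next.sum()       # renormalize to limit rounding drift
        if np.abs(p_next - p).sum() < tol:
            return p_next
        p = p_next
    raise RuntimeError("power iteration did not converge")

# Toy 3-state chain; each column sums to 1.
A = sparse.csc_matrix(np.array([[0.9, 0.5, 0.0],
                                [0.1, 0.0, 0.3],
                                [0.0, 0.5, 0.7]]))
p = stationary_distribution(A)
```

For chains that mix slowly, a sparse eigensolver such as `scipy.sparse.linalg.eigs` may converge in fewer matrix-vector products.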

How to parallelize this Python for loop when using Numba

北城以北 submitted on 2019-11-29 01:32:08
I'm using the Anaconda distribution of Python, together with Numba, and I've written the following Python function that multiplies a sparse matrix A (stored in CSR format) by a dense vector x:

    @jit
    def csrMult( x, Adata, Aindices, Aindptr, Ashape ):
        numRowsA = Ashape[0]
        Ax = numpy.zeros( numRowsA )
        for i in range( numRowsA ):
            Ax_i = 0.0
            for dataIdx in range( Aindptr[i], Aindptr[i+1] ):
                j = Aindices[dataIdx]
                Ax_i += Adata[dataIdx] * x[j]
            Ax[i] = Ax_i
        return Ax

Here A is a large scipy sparse matrix,

    >>> A.shape
    ( 56469, 39279 )
    # having ~ 142,258,302 nonzero entries (so about 6.4% )
    >>> type(
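Since each output row depends only on its own slice of the CSR arrays, the outer loop is a natural candidate for Numba's `prange`. The following is a hedged sketch of that idea rather than the thread's answer: the function name `csr_mult_parallel` and the test matrix are made up, the row count is passed as an integer instead of the shape tuple, and the `try`/`except` fallback exists only so the example still runs (serially) where Numba is not installed.

```python
import numpy as np
from scipy import sparse

try:
    from numba import njit, prange
except ImportError:
    # Fallback so the sketch runs without Numba (no speed-up in that case).
    prange = range

    def njit(*args, **kwargs):
        return lambda func: func

@njit(parallel=True)
def csr_mult_parallel(x, Adata, Aindices, Aindptr, num_rows):
    Ax = np.zeros(num_rows)
    # Rows are independent, so iterations of the outer loop can run in parallel.
    for i in prange(num_rows):
        acc = 0.0
        for data_idx in range(Aindptr[i], Aindptr[i + 1]):
            acc += Adata[data_idx] * x[Aindices[data_idx]]
        Ax[i] = acc
    return Ax

# Small random CSR matrix and dense vector for a sanity check.
rng = np.random.default_rng(0)
dense = (rng.random((60, 40)) < 0.1) * rng.random((60, 40))
A = sparse.csr_matrix(dense)
x = rng.random(40)
Ax = csr_mult_parallel(x, A.data, A.indices, A.indptr, A.shape[0])
```

Whether this beats scipy's built-in `A @ x` depends on the matrix and the number of cores, so it is worth timing both.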

Computing sparse pairwise distance matrix in R

狂风中的少年 submitted on 2019-11-29 00:46:36
Question: I have an NxM matrix and I want to compute the NxN matrix of Euclidean distances between the N points (rows). In my problem, N is about 100,000. As I plan to use this matrix for a k-nearest neighbor algorithm, I only need to keep the k smallest distances, so the resulting NxN matrix is very sparse. This is in contrast to what comes out of dist(), for example, which would result in a dense matrix (and probably storage problems for my size N). The packages for kNN that I've found so far ( knnflex ,
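The question concerns R, but the underlying idea can be illustrated in Python: query a spatial index for the k nearest neighbours of every point and store only those distances in a sparse matrix. This is a sketch under assumptions (the function name `knn_distance_matrix` and the random data are hypothetical); it relies on each point being returned as its own nearest neighbour, which may not hold when the data contains duplicate points, and KD-trees tend to lose their advantage when M is large.

```python
import numpy as np
from scipy import sparse
from scipy.spatial import cKDTree

def knn_distance_matrix(X, k):
    """Sparse N x N matrix holding each point's k smallest Euclidean distances."""
    n = X.shape[0]
    tree = cKDTree(X)
    # Ask for k + 1 neighbours: the closest one is the query point itself.
    dist, idx = tree.query(X, k=k + 1)
    rows = np.repeat(np.arange(n), k)
    cols = idx[:, 1:].ravel()
    vals = dist[:, 1:].ravel()
    return sparse.csr_matrix((vals, (rows, cols)), shape=(n, n))

rng = np.random.default_rng(0)
X = rng.random((50, 3))          # 50 points in 3 dimensions
D = knn_distance_matrix(X, k=4)
```

The result stores N*k entries instead of N*N, and it is generally not symmetric, since being among i's nearest neighbours does not imply the reverse.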

How to properly pass a scipy.sparse CSR matrix to a cython function?

淺唱寂寞╮ submitted on 2019-11-29 00:21:20
I need to pass a scipy.sparse CSR matrix to a cython function. How do I specify the type, as one would for a numpy array?

Here is an example of how to quickly access the data from a coo_matrix using the properties row, col and data. The purpose of the example is just to show how to declare the data types and create the buffers (also adding the compiler directives that will usually give you a considerable boost)...

    #cython: boundscheck=False
    #cython: wraparound=False
    #cython: cdivision=True
    #cython: nonecheck=False
    import numpy as np
    from scipy.sparse import coo_matrix
    cimport numpy as np

sparse matrix library for C++ [closed]

坚强是说给别人听的谎言 submitted on 2019-11-29 00:12:39
Is there any sparse matrix library that can do the following?

- solve linear algebraic equations
- support operations like matrix-matrix/number multiplication/addition/subtraction, matrix transposition, getting a row/column of a matrix, and so on
- handle matrix sizes of 40k*40k or bigger, like 250k*250k
- be fast
- be usable on Windows

Can someone recommend some libraries for me? If you recommend one, please tell me its advantages and disadvantages, and the reason why you recommend it. By the way, I have searched for many sparse matrix libraries on the internet and tested some of them. I found that each of them only

Performing PCA on large sparse matrix by using sklearn

坚强是说给别人听的谎言 submitted on 2019-11-28 22:57:32
Question: I am trying to apply PCA to a huge sparse matrix. The following link says that sklearn's RandomizedPCA can handle sparse matrices in scipy sparse format:

Apply PCA on very large sparse matrix

However, I always get an error. Can someone point out what I am doing wrong?

The input matrix 'X_train' contains float64 numbers:

    >>>type(X_train)
    <class 'scipy.sparse.csr.csr_matrix'>
    >>>X_train.shape
    (2365436, 1617899)
    >>>X_train.ndim
    2
    >>>X_train[0]
    <1x1617899 sparse matrix of type '<type 'numpy