sparse-matrix

The fastest way to calculate eigenvalues of large matrices

Submitted by 爷,独闯天下 on 2019-12-07 17:24:50
Question: Until now I have used numpy.linalg.eigvals to calculate the eigenvalues of square matrices with at least 1000 rows/columns and, in most cases, about a fifth of the entries non-zero (I don't know if that should be considered a sparse matrix). I found another topic indicating that scipy can possibly do a better job. However, since I have to calculate the eigenvalues of hundreds of thousands of large matrices of increasing size (possibly up to 20000 rows/columns, and yes, I need ALL of their…
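A minimal sketch of the trade-off the question is about, with an assumed small stand-in size (n = 200 instead of 1000+): the dense NumPy route is the only one that yields ALL eigenvalues, while scipy.sparse.linalg.eigs (ARPACK) can only return k < n of them, so it helps only when a subset (e.g. the largest in magnitude) is enough.

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

rng = np.random.default_rng(0)
n = 200  # stand-in for the 1000+ sized matrices in the question
A = sp.random(n, n, density=0.2, format="csr", random_state=rng)

# Dense route: required when ALL eigenvalues are needed.
all_eigs = np.linalg.eigvals(A.toarray())

# Sparse route: ARPACK returns only k < n eigenvalues (here the 6 of
# largest magnitude), so it cannot replace the dense call for this use case.
k_eigs = spla.eigs(A, k=6, return_eigenvectors=False)
```

The upshot for the question: wanting every eigenvalue rules out the usual sparse iterative solvers, so the matrices end up dense for the eigensolver regardless of storage format.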

How to handle huge sparse matrices construction using Scipy?

Submitted by ♀尐吖头ヾ on 2019-12-07 15:29:34
Question: So, I am working on a Wikipedia dump to compute the PageRanks of around 5,700,000 pages, give or take. The files are preprocessed and hence are not in XML. They are taken from http://haselgrove.id.au/wikipedia.htm and the format is: from_page(1): to(12) to(13) to(14) … from_page(2): to(21) to(22) … from_page(5,700,000): to(xy) to(xz) and so on. So basically it is the construction of a [5,700,000 x 5,700,000] matrix, which would just break my 4 GB of RAM. Since it is very, very sparse, that…
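One way to build such an adjacency matrix without ever materializing the dense [n x n] array is to collect (row, col) index pairs while streaming the file and hand them to scipy.sparse.coo_matrix. The tiny parsed input below is hypothetical, standing in for the dump's "from_page: to to to" lines:

```python
import numpy as np
from scipy.sparse import coo_matrix

# Hypothetical already-parsed lines in the dump's "from: to to to" format,
# using 1-based page ids as in the question.
links = {1: [2, 3], 2: [1], 3: [1, 2]}
n_pages = 3

rows, cols = [], []
for src, targets in links.items():
    for dst in targets:
        rows.append(src - 1)   # convert to 0-based indices
        cols.append(dst - 1)

data = np.ones(len(rows), dtype=np.int8)  # int8: 1 byte per stored link
adj = coo_matrix((data, (rows, cols)), shape=(n_pages, n_pages)).tocsr()
```

Memory then scales with the number of links (a few tens of millions) rather than with n squared, which is what makes the 5.7M-page case fit in RAM.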

How to convert co-occurrence matrix to sparse matrix

Submitted by 守給你的承諾、 on 2019-12-07 13:51:12
Question: I am starting to deal with sparse matrices, so I'm not really proficient on this topic. My problem is: I have a simple co-occurrence matrix from a word list, just a 2-dimensional word-by-word matrix counting how many times a word occurs in the same context. The matrix is quite sparse since the corpus is not that big. I want to convert it to a sparse matrix to be able to deal with it better, and eventually do some matrix multiplication afterwards. Here is what I have done until now (only…
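The conversion itself is a one-liner: pass the dense array to scipy.sparse.csr_matrix, after which multiplication stays in sparse format. The toy counts below are assumptions, not taken from the question:

```python
import numpy as np
from scipy.sparse import csr_matrix

# Toy word-by-word co-occurrence counts (assumed for illustration).
cooc = np.array([[0, 2, 1],
                 [2, 0, 0],
                 [1, 0, 0]])

S = csr_matrix(cooc)   # only the 4 non-zero counts are stored
product = S @ S        # sparse-sparse product, result is also sparse
```

CSR is a good default for row slicing and multiplication; for incremental construction, COO or LIL is usually converted to CSR at the end.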

Computation on sparse data using GPU

Submitted by 不羁的心 on 2019-12-07 13:30:57
Question: I'm computing a function f(x) = exp(-x) in Matlab, where x is a vector of scalars. The function is computed on the GPU, e.g. x_cpu = [4 5 11 1]; x = gpuArray(x_cpu); f = exp(-x); Then the result would be: f = exp(-[4, 5, 11, 1]) = [0.0183, 0.0067, 1.6702e-005, 0.3679]. Note that f(x(3)) = f(11) = exp(-11) = 1.6702e-005 = 0.000016702, which is a pretty small value. So, I would like to avoid computing the function for all x(i) > 10 by simply setting f(x(i)) = 0. I can probably use the…
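The question is about Matlab/gpuArray, but the masking idea it is reaching for can be sketched in NumPy terms (the same logical-indexing pattern carries over to Matlab): evaluate exp only where the input is below the threshold and leave zeros elsewhere.

```python
import numpy as np

x = np.array([4.0, 5.0, 11.0, 1.0])

f = np.zeros_like(x)      # entries with x > 10 simply stay 0
small = x <= 10           # boolean mask: evaluate exp only where it matters
f[small] = np.exp(-x[small])
```

On a GPU, whether this actually saves time depends on the density of the mask; skipping elements only pays off when a substantial fraction of x exceeds the threshold.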

Sparse matrix slicing memory error

Submitted by 别说谁变了你拦得住时间么 on 2019-12-07 08:40:29
I have a sparse matrix csr: <681881x58216 sparse matrix of type '<class 'numpy.int64'>' with 2867209 stored elements in Compressed Sparse Row format> and I want to create a new sparse matrix as a slice of csr: csr_2 = csr[1::2, :]. Problem: with the csr matrix alone, my server's RAM usage is at 40 GB. When I run csr_2 = csr[1::2, :], the server's 128 GB of RAM fills up completely and it fails with a "Memory error". sparse uses matrix multiplication to select rows like this. I worked out the details of the extractor matrix in another SO question, but roughly, to get a (p, n) matrix…
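A small-scale sketch of the extractor-matrix mechanism the answer alludes to (sizes here are assumed, far below the question's 681881x58216): row selection `csr[1::2, :]` is equivalent to left-multiplying by a sparse selector matrix with one 1 per kept row.

```python
import numpy as np
from scipy.sparse import csr_matrix, random as sparse_random

rng = np.random.default_rng(0)
csr = sparse_random(100, 50, density=0.05, format="csr", random_state=rng)

# Extractor matrix E: row k of E has a single 1 in column idx[k],
# so E @ csr pulls out exactly the odd-numbered rows.
idx = np.arange(1, 100, 2)                     # the 50 odd row numbers
E = csr_matrix((np.ones(len(idx)), (np.arange(len(idx)), idx)),
               shape=(len(idx), 100))
csr_2 = E @ csr                                # same result as csr[1::2, :]
```

Understanding this helps explain the memory blow-up: the slice is not a view but a full multiplication producing a new matrix, with intermediates on top of the original.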

Most efficient way of accessing non-zero values in row/column in scipy.sparse matrix

Submitted by 柔情痞子 on 2019-12-07 08:23:59
Question: What is the fastest or, failing that, least wordy way of accessing all non-zero values in a row row or column col of a scipy.sparse matrix A in CSR format? Would doing it in another format (say, COO) be more efficient? Right now, I use the following: A[row, A[row, :].nonzero()[1]] or A[A[:, col].nonzero()[0], col] Answer 1: For a problem like this it pays to understand the underlying data structures of the different formats: In [672]: A=sparse.csr_matrix(np.arange(24).reshape(4,6)) In [673]: A…
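Building on the answer's point about the underlying data structures: for CSR, row r's stored values sit in the contiguous slice data[indptr[r]:indptr[r+1]], with matching column numbers in indices. Reading them directly avoids the overhead of fancy indexing:

```python
import numpy as np
from scipy import sparse

A = sparse.csr_matrix(np.arange(24).reshape(4, 6))
row = 2

# CSR layout: row r occupies data[indptr[r]:indptr[r+1]].
start, end = A.indptr[row], A.indptr[row + 1]
row_values = A.data[start:end]      # non-zero values of the row
row_cols = A.indices[start:end]     # their column indices
```

For column access, the symmetric trick works on the CSC format (A.tocsc() once, then slice its indptr/data), since CSR stores no per-column structure.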

Scipy sparse matrices element wise multiplication

Submitted by 梦想的初衷 on 2019-12-07 07:55:12
Question: I am trying to do an element-wise multiplication of two large sparse matrices. Both are of size around (400K x 500K), with around 100M elements. However, they might not have non-zero elements in the same positions, and they might not have the same number of non-zero elements. In either situation, I'm okay with multiplying the non-zero value of one matrix by the zero value in the other matrix to get zero. I keep running out of memory (8 GB) in every approach, which doesn't make much sense. I…
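The sparse-native way to do this is the .multiply method, which never densifies: a zero in either operand produces a zero in the result, so the output has at most min(nnz(A), nnz(B)) stored entries. A tiny sketch:

```python
import numpy as np
from scipy.sparse import csr_matrix

A = csr_matrix(np.array([[1, 0, 2], [0, 3, 0]]))
B = csr_matrix(np.array([[0, 5, 2], [4, 3, 0]]))

# Element-wise product in sparse format; positions where either input
# is (implicitly) zero simply drop out of the result.
C = A.multiply(B)
```

Memory errors in this setting usually come from accidentally triggering a dense intermediate, e.g. using `*` on two numpy-matrix-converted operands or calling .toarray() somewhere in the pipeline.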

How to find and name contiguous non-zero entries in a sparse matrix in R?

Submitted by 六月ゝ 毕业季﹏ on 2019-12-07 07:53:05
Question: My problem is conceptually simple; I am looking for a computationally efficient solution (my own attempt is attached at the end). Suppose we have a potentially very large sparse matrix like the one on the left below and want to 'name' every area of contiguous non-zero elements with a separate code (see the matrix on the right):

1 1 1 . . . . .      1 1 1 . . . . .
1 1 1 . 1 1 . .      1 1 1 . 4 4 . .
1 1 1 . 1 1 . .      1 1 1 . 4 4 . .
. . . . 1 1 . . ---> . . . . 4 4 . .
. . 1 1 . . 1 1      . . 3 3 . . 7 7
1 . 1 1 …
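The question asks for an R solution, but the underlying operation is connected-component labelling, and a Python analogue shows the idea compactly: scipy.ndimage.label assigns a distinct integer code to each 4-connected region of non-zeros (the example matrix below is assumed, not the one from the question).

```python
import numpy as np
from scipy import ndimage

# Label 4-connected clumps of non-zero entries with distinct codes.
m = np.array([[1, 1, 0, 0],
              [1, 1, 0, 1],
              [0, 0, 0, 1],
              [1, 0, 0, 1]])

labels, n_regions = ndimage.label(m)  # default structure = 4-connectivity
```

In R, the same result is typically obtained with raster::clump or by building a graph of adjacent non-zero cells and extracting its connected components with igraph.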

Given a matrix of type `scipy.sparse.coo_matrix` how to determine index and value of maximum of each row?

Submitted by 匆匆过客 on 2019-12-07 07:03:32
Question: Given a sparse matrix R of type scipy.sparse.coo_matrix of shape 1,000,000 x 70,000, I figured out that row_maximum = max(R.getrow(i).data) will give me the maximum value of the i-th row. What I need now is the index corresponding to the value row_maximum. Any ideas how to achieve that? Thanks for any advice in advance! Answer 1: getrow(i) returns a 1 x n CSR matrix, which has an indices attribute that gives the column indices of the corresponding values in the data attribute. (We know the shape is 1…

Efficient way of taking Logarithm function in a sparse matrix

Submitted by 拈花ヽ惹草 on 2019-12-07 05:58:00
Question: I have a big sparse matrix. I want to take the base-4 logarithm of every element in that sparse matrix. I tried numpy.log(), but it doesn't work with matrices. I can also take the logarithm row by row, then overwrite the old row with the new one:

# Assume A is a sparse matrix (Linked List format) with float values as data
# This handles only one row
import numpy as np
c = np.log(A.getrow(0)) / np.log(4)
A[0, :] = c

This was not as quick as I'd expected. Is there a faster way to do this? Answer 1: You can modify the data…
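Sketching where the answer is headed: apply the logarithm to the matrix's .data array in one vectorized step. Only the stored values are touched, the implicit zeros stay zero, and log(0) is never evaluated (this assumes all stored values are positive):

```python
import numpy as np
from scipy.sparse import csr_matrix

A = csr_matrix(np.array([[4.0, 0.0, 16.0],
                         [0.0, 64.0, 0.0]]))

# One vectorized pass over the stored values only; implicit zeros
# are untouched, which also sidesteps log(0).
A.data = np.log(A.data) / np.log(4)   # base-4 logarithm
```

This replaces the row-by-row loop with a single operation over the nnz stored entries, which is where the speedup comes from.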