sparse-matrix

How to replace the elements in a big sparse matrix?

浪尽此生 submitted on 2020-01-05 08:46:47
Question: I have quite a big sparse matrix, about 150,000 x 150,000. I need to access its rows, extract the non-zero elements, and replace these values following the rule in the code below:

```matlab
H = [];
for i = 1:size(A,2)
    [a,b,c] = find(A(i,:));                  % extract the rows
    if size(c,2) == 1                        % only 2
        add = 0;
    elseif size(c,2) > 1 && any(c<2) == 0    % many 2s
        add = c;
        add(1) = -2;
        add(end) = 2;
        add(2:end-1) = 0;
    elseif size(c,2) > 1 && any(c<2) ~= 0    % 2 and 1
        k = find(diff(c)==-1);               % find right 2 position
        add = c;
```
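
A similar row-wise update can be expressed in Python with scipy, whose CSR format exposes each row's nonzeros as a contiguous slice of the data array. The following is a minimal sketch; since the question's replacement rule is cut off above, the rule applied here (modeled on the "many 2s" branch) is only illustrative, and the matrix A is random stand-in data:

```python
import numpy as np
from scipy.sparse import random as sparse_random

# Stand-in data: the real matrix would be ~150,000 x 150,000.
A = sparse_random(1000, 1000, density=0.001, format='csr')

for i in range(A.shape[0]):
    start, end = A.indptr[i], A.indptr[i + 1]  # slice of row i's nonzeros
    n = end - start
    if n == 1:
        A.data[start] = 0                      # single nonzero -> 0
    elif n > 1:
        A.data[start] = -2                     # first nonzero -> -2
        A.data[end - 1] = 2                    # last nonzero -> 2
        A.data[start + 1:end - 1] = 0          # interior nonzeros -> 0

A.eliminate_zeros()  # drop the entries that were just set to zero
```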

Graph.get_adjacency() is slow and the output is strange

痞子三分冷 submitted on 2020-01-05 07:41:25
Question: Consider a graph object G in python-igraph 0.7. If I want the adjacency matrix A of G, I have to write A = G.get_adjacency(), but there are two problems:

1. Even though G is sparse, with 3000 nodes, A takes a long time to generate on my consumer laptop. Can creating the adjacency matrix really be this expensive?
2. The output A is a Matrix object, so if I want to operate on A with the numpy module, I have to convert it first to a list and then to a numpy.matrix. Moreover, if A is sparse I
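
One common workaround is to skip the dense Matrix object entirely and build a scipy sparse adjacency matrix from the edge list. A minimal sketch, assuming an undirected, unweighted graph; the Erdos_Renyi graph here is a stand-in for the real G:

```python
import numpy as np
from igraph import Graph
from scipy.sparse import coo_matrix

G = Graph.Erdos_Renyi(n=3000, m=9000)  # stand-in for the real graph

edges = np.array(G.get_edgelist())
n = G.vcount()
ones = np.ones(len(edges))

# Insert both (i, j) and (j, i) so the undirected adjacency is symmetric.
A = coo_matrix(
    (np.concatenate([ones, ones]),
     (np.concatenate([edges[:, 0], edges[:, 1]]),
      np.concatenate([edges[:, 1], edges[:, 0]]))),
    shape=(n, n),
).tocsr()
```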

Baseline correction for spectroscopic data

一笑奈何 submitted on 2020-01-05 04:40:11
Question: I am working with Raman spectra, which often have a baseline superimposed on the actual signal I am interested in. I would therefore like to estimate the baseline contribution. For this purpose, I implemented a solution from this question. I like the solution described there, and the code works fine on my data. A typical result for calculated data looks like this, with the red and orange lines being the baseline estimates:

[Figure: typical result of baseline estimation on calculated data]
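
The solution linked in questions like this is typically the asymmetric least squares (ALS) smoother of Eilers and Boelens; whether that matches the exact code the author used is an assumption. Since the question body does not show it, here is the widely circulated sketch of that approach, with illustrative parameter values:

```python
import numpy as np
from scipy import sparse
from scipy.sparse.linalg import spsolve

def baseline_als(y, lam=1e5, p=0.01, niter=10):
    """Asymmetric least squares baseline (Eilers & Boelens sketch)."""
    L = len(y)
    # Second-difference operator as a sparse matrix.
    D = sparse.diags([1.0, -2.0, 1.0], [0, -1, -2], shape=(L, L - 2))
    w = np.ones(L)
    for _ in range(niter):
        W = sparse.spdiags(w, 0, L, L)
        Z = W + lam * D.dot(D.transpose())
        z = spsolve(Z, w * y)
        # Reweight: points above the fit (peaks) get small weight,
        # points below (baseline) get large weight.
        w = p * (y > z) + (1 - p) * (y < z)
    return z
```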

How to incrementally create a sparse matrix in Python?

﹥>﹥吖頭↗ submitted on 2020-01-05 04:40:07
Question: I am creating a co-occurrence matrix of integers, of size 1M by 1M. After the matrix is created, the only operation I will perform on it is to get the top N values per row (or per column, as it is a symmetric matrix). I have to create the matrix as sparse to be able to fit it in memory. I read input data from a big file and update the co-occurrence of two indexes (row, col) incrementally. The sample code for the sparse dok_matrix specifies that I should declare the size of the matrix beforehand. I know
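
For context, declaring the full shape up front does not allocate 1M x 1M cells: dok_matrix is backed by a hash map keyed on (row, col), so memory grows only with the number of stored entries. A minimal sketch of the incremental update pattern, with a toy stream standing in for the big input file:

```python
import numpy as np
from scipy.sparse import dok_matrix

n = 1_000_000
cooc = dok_matrix((n, n), dtype=np.int32)  # no dense allocation happens here

# Hypothetical stream of (row, col) pairs as they would be read from the file.
for row, col in [(3, 7), (3, 7), (12, 99)]:
    cooc[row, col] += 1  # dok supports cheap single-element updates

print(cooc[3, 7])  # -> 2
```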

Fortran function to overload multiplication between derived types with allocatable components

╄→尐↘猪︶ㄣ submitted on 2020-01-05 03:46:07
Question: Foreword: In order to store banded matrices whose full counterparts can have both rows and columns indexed from indices other than 1, I defined a derived data type as

```fortran
TYPE CDS
    REAL,    DIMENSION(:,:), ALLOCATABLE :: matrix
    INTEGER, DIMENSION(2)                :: lb, ub
    INTEGER                              :: ld, ud
END TYPE CDS
```

where CDS stands for compressed diagonal storage. Given the declaration TYPE(CDS) :: A, the rank-2 component matrix is supposed to contain, as columns, the diagonals of the actual full matrix (like here, except
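
As a cross-language point of comparison, scipy's dia_matrix implements the same compressed-diagonal idea in Python, storing the diagonals as rows of a dense array together with their offsets. A small sketch with arbitrary values:

```python
import numpy as np
from scipy.sparse import dia_matrix

# Two diagonals, identified by their offsets (0 = main, -1 = first
# subdiagonal), stored as rows of a dense array -- the transpose of the
# CDS layout above, which keeps the diagonals as columns.
data = np.array([[1, 2, 3, 4],
                 [5, 6, 7, 8]])
offsets = np.array([0, -1])
A = dia_matrix((data, offsets), shape=(4, 4))
print(A.toarray())
```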

Converting from sparse to dense to sparse again decreases density after constructing sparse matrix

故事扮演 submitted on 2020-01-05 03:26:28
Question: I am using scipy to generate a sparse finite-difference matrix, constructing it initially from block matrices and then editing the diagonal to account for boundary conditions. The resulting sparse matrix is of the BSR type. I have found that if I convert the matrix to a dense matrix and then back to a sparse matrix using the scipy.sparse.bsr_matrix function, I am left with a sparser matrix than before. Here is the code I use to generate the matrix:

```python
size = (4,4)
xDiff = np.zeros((size[0]+1
```
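
A plausible explanation for the density change is explicitly stored zeros: edits that set stored entries to zero leave them in the structure and counted in nnz, whereas a dense-to-sparse round trip keeps only the truly nonzero values. A small sketch demonstrating the effect on a CSR matrix with illustrative toy values:

```python
import numpy as np
from scipy import sparse

A = sparse.csr_matrix(np.array([[1.0, 0.0],
                                [0.0, 2.0]]))
A.data[0] = 0.0      # the value becomes zero, but the entry stays stored
print(A.nnz)         # -> 2: nnz counts the explicit zero

A.eliminate_zeros()  # prune explicitly stored zeros in place
print(A.nnz)         # -> 1: same result as a dense -> sparse round trip
```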

Concatenate sparse matrix Eigen

自作多情 submitted on 2020-01-04 17:56:39
Question: I have two sparse matrices in Eigen, and I would like to join them vertically into one. As an example, the target of the code would be:

```cpp
SparseMatrix<double> matrix1;
matrix1.resize(10, 10);
SparseMatrix<double> matrix2;
matrix2.resize(5, 10);
SparseMatrix<double> MATRIX_JOIN;
MATRIX_JOIN.resize(15, 10);
MATRIX_JOIN << matrix1, matrix2;
```

I found some solutions on a forum; however, I wasn't able to implement them. What's the proper way to join the matrices vertically?

Edit: My implementation:
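
For comparison, the equivalent vertical join is a one-liner in Python with scipy; as far as I know, Eigen's comma initializer used in the snippet above is only defined for dense matrices, which would explain why it fails for SparseMatrix. A conceptual sketch with random stand-in data:

```python
from scipy import sparse

m1 = sparse.random(10, 10, density=0.1, format='csr')
m2 = sparse.random(5, 10, density=0.1, format='csr')

# Stack vertically; the column counts must match.
joined = sparse.vstack([m1, m2], format='csr')
print(joined.shape)  # -> (15, 10)
```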

Hierarchical Clustering Large Sparse Distance Matrix R

我只是一个虾纸丫 submitted on 2020-01-04 07:56:40
Question: I am attempting to perform fastclust on a very large set of distances, but am running into a problem. I have a very large csv file (about 91 million rows, so a for loop takes too long in R) of similarities between keywords (about 50,000 unique keywords) that, when read into a data.frame, looks like:

```
> df
  kwd1 kwd2 similarity
     a    b          1
     b    a          1
     c    a          2
     a    c          2
```

It is a sparse list, and I can convert it into a sparse matrix using sparseMatrix():

```
> myMatrix
  a b c
a . . .
b 1 . .
c 2 . .
```

However, when I attempt
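
The triplet-to-sparse construction itself translates directly to other stacks; as a cross-language illustration, the same build in Python maps the keyword labels to integer indices and feeds them to coo_matrix. A sketch, with the toy rows above standing in for the 91-million-row file:

```python
import pandas as pd
from scipy.sparse import coo_matrix

df = pd.DataFrame({'kwd1': ['a', 'b', 'c', 'a'],
                   'kwd2': ['b', 'a', 'a', 'c'],
                   'similarity': [1, 1, 2, 2]})

# Map keyword labels to integer indices, then build the triplet matrix.
keywords = pd.Index(sorted(set(df['kwd1']) | set(df['kwd2'])))
rows = keywords.get_indexer(df['kwd1'])
cols = keywords.get_indexer(df['kwd2'])
M = coo_matrix((df['similarity'].to_numpy(), (rows, cols)),
               shape=(len(keywords), len(keywords)))
```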

Cosine similarity yields 'nan' values

回眸只為那壹抹淺笑 submitted on 2020-01-04 07:27:43
Question: I was calculating a cosine similarity matrix for sparse vectors, and elements that were expected to be floats came out as 'nan'. 'visits' is a sparse matrix recording how many times each user has visited each website. The matrix originally had shape 1,500,000 x 1,500, but I converted it into a sparse matrix using coo_matrix().tocsc(). The task is to find out how similar the websites are, so I decided to calculate the cosine metric between each pair of sites. Here is my code: cosine_distance
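
A common source of nan in cosine computations is an all-zero row or column, whose norm of 0 produces a 0/0 division. One way around it, as a sketch, is scikit-learn's cosine_similarity, which accepts sparse input directly and, to my knowledge, leaves all-zero vectors at similarity 0 rather than nan; the random matrix here is a stand-in for visits:

```python
from scipy.sparse import random as sparse_random
from sklearn.metrics.pairwise import cosine_similarity

visits = sparse_random(1000, 1500, density=0.01, format='csc')

# Sites are columns, so transpose to compare sites row-wise.
S = cosine_similarity(visits.T, dense_output=False)
print(S.shape)  # -> (1500, 1500)
```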