sparse-matrix

Minimum Tile Ordering

女生的网名这么多〃 Submitted on 2019-12-04 09:43:27
Problem: Minimizing Tile Re-ordering. Suppose I had the following symmetric 9x9 matrix, N^2 interactions between N particles: (1,2) (2,9) (4,5) (4,6) (5,8) (7,8). These are symmetric interactions, so they implicitly imply that (2,1) (9,2) (5,4) (6,4) (8,5) (8,7) also exist. In my problem, suppose they are arranged in matrix form, where only the upper triangle is shown:

```
t     0     1     2    (tiles)
#   1 2 3 4 5 6 7 8 9
1 [ 0 1 0 0 0 0 0 0 0 ]  0
2 [ x 0 0 0 0 0 0 0 1 ]
3 [ x x 0 0 0 0 0 0 0 ]
4 [ x x x 0 1
```
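A standard technique for this kind of reordering (not from the question itself, but closely related) is bandwidth reduction via Reverse Cuthill-McKee, which SciPy exposes directly. A sketch using the question's interaction pairs; the variable names are my own:

```python
# Sketch: Reverse Cuthill-McKee reordering to cluster the 9x9 interaction
# pattern near the diagonal (a common proxy for good tile locality).
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import reverse_cuthill_mckee

# Symmetric interactions from the question (1-based pairs -> 0-based indices)
pairs = [(1, 2), (2, 9), (4, 5), (4, 6), (5, 8), (7, 8)]
rows = [i - 1 for i, j in pairs] + [j - 1 for i, j in pairs]
cols = [j - 1 for i, j in pairs] + [i - 1 for i, j in pairs]
A = csr_matrix((np.ones(len(rows)), (rows, cols)), shape=(9, 9))

perm = reverse_cuthill_mckee(A, symmetric_mode=True)  # new-to-old particle order
A_reordered = A[perm][:, perm]  # apply the same permutation to rows and columns
```

Whether RCM ordering is optimal for the asker's tile criterion is a separate question, but it is a cheap first pass that packs interacting particles into nearby indices.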

In R, when using named rows, can a sparse matrix column be added (concatenated) to another sparse matrix?

无人久伴 Submitted on 2019-12-04 09:33:30
I have two sparse matrices, m1 and m2:

```r
> m1 <- Matrix(data=0, nrow=2, ncol=1, sparse=TRUE, dimnames=list(c("b","d"), NULL))
> m2 <- Matrix(data=0, nrow=2, ncol=1, sparse=TRUE, dimnames=list(c("a","b"), NULL))
> m1["b",1] <- 4
> m2["a",1] <- 5
> m1
2 x 1 sparse Matrix of class "dgCMatrix"
b 4
d .
> m2
2 x 1 sparse Matrix of class "dgCMatrix"
a 5
b .
```

and I want to cbind() them to make a sparse matrix like:

```
  [,1] [,2]
a    .    5
b    4    .
d    .    .
```

however cbind() ignores the named rows:

```
> cbind(m1[,1], m2[,1])
  [,1] [,2]
b    4    5
d    0    0
```

Is there some way to do this without a brute-force loop? You should send the
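The underlying operation is: take the union of the row names, expand each column onto that index, then bind. A sketch of that alignment logic in Python/SciPy (the helper name and the dict-based input format are my own, not the R answer):

```python
# Sketch: align sparse columns on the union of their row names before binding.
import numpy as np
from scipy.sparse import csc_matrix, hstack

def cbind_by_rowname(cols):
    """Column-bind sparse columns keyed by row name, aligned on the union of names."""
    all_names = sorted(set().union(*(c.keys() for c in cols)))
    index = {name: i for i, name in enumerate(all_names)}
    mats = []
    for c in cols:
        rows = [index[name] for name in c]
        mats.append(csc_matrix((list(c.values()), (rows, [0] * len(rows))),
                               shape=(len(all_names), 1)))
    return all_names, hstack(mats).tocsc()

# m1 has named rows b, d (b = 4); m2 has named rows a, b (a = 5)
names, m = cbind_by_rowname([{"b": 4, "d": 0}, {"a": 5, "b": 0}])
```

The result has rows a, b, d and matches the desired 3x2 layout above. In R the analogous move would be to reindex both matrices onto the union of their rownames before calling cbind().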

How to find and name contiguous non-zero entries in a sparse matrix in R?

我与影子孤独终老i Submitted on 2019-12-04 09:09:20
My problem is conceptually simple. I am looking for a computationally efficient solution to it (I attach my own attempt at the end). Suppose we have a potentially very large sparse matrix like the one on the left below, and we want to 'name' every area of contiguous non-zero elements with a separate code (see the matrix on the right):

```
1 1 1 . . . . .        1 1 1 . . . . .
1 1 1 . 1 1 . .        1 1 1 . 4 4 . .
1 1 1 . 1 1 . .        1 1 1 . 4 4 . .
. . . . 1 1 . .  --->  . . . . 4 4 . .
. . 1 1 . . 1 1        . . 3 3 . . 7 7
1 . 1 1 . . 1 1        2 . 3 3 . . 7 7
1 . . . 1 . . .        2 . . . 5 . . .
1 . . . . 1 1 1        2 . . . . 6 6 6
```

In my
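This is connected-component labeling, and (assuming SciPy is acceptable; this is not the asker's own attempt) scipy.ndimage.label does exactly it in one call:

```python
# Sketch: label contiguous non-zero regions with scipy.ndimage.label.
import numpy as np
from scipy.ndimage import label

M = np.array([
    [1, 1, 1, 0, 0, 0, 0, 0],
    [1, 1, 1, 0, 1, 1, 0, 0],
    [1, 1, 1, 0, 1, 1, 0, 0],
    [0, 0, 0, 0, 1, 1, 0, 0],
    [0, 0, 1, 1, 0, 0, 1, 1],
    [1, 0, 1, 1, 0, 0, 1, 1],
    [1, 0, 0, 0, 1, 0, 0, 0],
    [1, 0, 0, 0, 0, 1, 1, 1],
])

labels, n_regions = label(M)  # default structure = 4-connectivity
```

For the matrix above this finds 7 regions. The label numbers are assigned in raster-scan order, so they may differ from the codes in the question, but the partition into regions is the same; a 3x3 structuring element can be passed to also merge diagonal neighbours.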

CSR Matrix - Matrix multiplication

*爱你&永不变心* Submitted on 2019-12-04 07:52:04
I have two square matrices A and B. I must convert B to CSR format and determine the product C = A * B_csr. I have found a lot of information online regarding CSR matrix-vector multiplication. The algorithm is:

```c
for (i = 0; i < N; i = i + 1)
    result[i] = 0;
for (i = 0; i < N; i = i + 1) {
    for (k = RowPtr[i]; k < RowPtr[i+1]; k = k + 1) {
        result[i] = result[i] + Val[k] * d[Col[k]];
    }
}
```

However, I require matrix-matrix multiplication. Further, it seems that most algorithms apply A_csr * vector multiplication, where I require A * B_csr. My solution is to transpose the two matrices before
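Transposing is not actually required: since C[i,:] = Σ_j A[i,j] * B[j,:], a dense A times a CSR B can walk row j of B whenever A[i,j] is non-zero. A Python sketch of that loop structure (my own function name; checked against SciPy's product):

```python
# Sketch: dense A times CSR B without transposing either matrix.
import numpy as np
from scipy.sparse import csr_matrix

def dense_times_csr(A, indptr, indices, data, n_cols):
    """C = A @ B, with B given by its CSR arrays (indptr/indices/data)."""
    m, n = A.shape
    C = np.zeros((m, n_cols))
    for i in range(m):                       # each output row
        for j in range(n):                   # C[i,:] += A[i,j] * (row j of B)
            a_ij = A[i, j]
            if a_ij != 0.0:
                for k in range(indptr[j], indptr[j + 1]):
                    C[i, indices[k]] += a_ij * data[k]
    return C

A = np.array([[1.0, 2.0, 0.0],
              [0.0, 3.0, 4.0]])
B = csr_matrix(np.array([[0.0, 5.0],
                         [6.0, 0.0],
                         [0.0, 7.0]]))
C = dense_times_csr(A, B.indptr, B.indices, B.data, B.shape[1])
```

The inner k-loop is exactly the mat-vec kernel from the question, applied to row j of B and scaled by A[i,j].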

How to build a sparse matrix in PySpark?

我的梦境 Submitted on 2019-12-04 07:08:14
I am new to Spark. I would like to make a sparse matrix, specifically a user-id by item-id matrix, for a recommendation engine. I know how I would do this in Python. How does one do this in PySpark? Here is how I would have done it with a matrix. The table looks like this now:

```
Session ID | Item ID | Rating
     1     |    2    |   1
     1     |    3    |   5
```

```python
import numpy as np

data = df[['session_id', 'item_id', 'rating']].values
rows, row_pos = np.unique(data[:, 0], return_inverse=True)
cols, col_pos = np.unique(data[:, 1], return_inverse=True)
pivot_table = np.zeros((len(rows), len(cols)), dtype=data.dtype)
pivot_table[row_pos, col_pos] =
```
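As a local (non-Spark) baseline, the same triplet table maps directly onto scipy.sparse.coo_matrix, which avoids ever building the dense pivot table. A sketch using the two rows from the question's table:

```python
# Sketch: build a sparse session-by-item matrix from (row, col, value) triplets.
import numpy as np
from scipy.sparse import coo_matrix

# (session_id, item_id, rating) triplets from the question's table
data = np.array([[1, 2, 1],
                 [1, 3, 5]])
rows, row_pos = np.unique(data[:, 0], return_inverse=True)
cols, col_pos = np.unique(data[:, 1], return_inverse=True)
mat = coo_matrix((data[:, 2], (row_pos, col_pos)),
                 shape=(len(rows), len(cols)))
```

In PySpark itself, the analogous distributed structure is a CoordinateMatrix built from MatrixEntry triples (pyspark.mllib.linalg.distributed), which takes the same (row, col, value) form.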

kNN with big sparse matrices in Python

烈酒焚心 Submitted on 2019-12-04 06:40:22
I have two large sparse matrices:

```
In [3]: trainX
Out[3]:
<6034195x755258 sparse matrix of type '<type 'numpy.float64'>'
    with 286674296 stored elements in Compressed Sparse Row format>

In [4]: testX
Out[4]:
<2013337x755258 sparse matrix of type '<type 'numpy.float64'>'
    with 95423596 stored elements in Compressed Sparse Row format>
```

About 5 GB of RAM in total to load. Note these matrices are HIGHLY sparse (0.0062% occupied). For each row in testX, I want to find the nearest neighbour in trainX and return its corresponding label, found in trainY. trainY is a list with the same length as trainX and
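One memory-conscious sketch (not the asker's code) is chunked brute-force search using the identity ||t − x||² = ||t||² + ||x||² − 2 t·x, where the cross term is a sparse-sparse product, so no dense copy of either matrix is ever materialized. The function name and chunk size are my own:

```python
# Sketch: chunked exact 1-NN over sparse CSR matrices via the expanded
# squared-distance formula; only one chunk of distances is dense at a time.
import numpy as np
from scipy.sparse import csr_matrix

def nearest_neighbor_labels(trainX, trainY, testX, chunk=1000):
    """For each row of testX, return the label of its Euclidean NN in trainX."""
    train_sq = np.asarray(trainX.multiply(trainX).sum(axis=1)).ravel()
    out = np.empty(testX.shape[0], dtype=object)
    for start in range(0, testX.shape[0], chunk):
        block = testX[start:start + chunk]
        cross = (block @ trainX.T).toarray()     # sparse-sparse dot products
        # ||t||^2 is constant per test row, so dropping it preserves the argmin
        d2 = train_sq[None, :] - 2.0 * cross
        nn = d2.argmin(axis=1)
        out[start:start + chunk] = [trainY[i] for i in nn]
    return out

trainX = csr_matrix(np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]))
testX = csr_matrix(np.array([[0.1, 0.0], [1.9, 2.1]]))
labels = nearest_neighbor_labels(trainX, ['a', 'b', 'c'], testX)
```

sklearn.neighbors.NearestNeighbors also accepts CSR input and does essentially this internally, but at the question's scale the dense chunk of the distance matrix is the quantity to budget for either way.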

Efficiently accumulating a collection of sparse scipy matrices

断了今生、忘了曾经 Submitted on 2019-12-04 04:26:29
I've got a collection of O(N) N×N scipy.sparse.csr_matrix, and each sparse matrix has on the order of N elements set. I want to add all these matrices together to get a regular N×N numpy array (N is on the order of 1000). The arrangement of non-zero elements within the matrices is such that the resulting sum is certainly not sparse (virtually no zero elements are left, in fact). At the moment I'm just doing

```python
reduce(lambda x, y: x + y, [m.toarray() for m in my_sparse_matrices])
```

which works, but is a bit slow: of course, the sheer amount of pointless processing of zeros which is going on there is
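One way to avoid both the dense temporaries and the zero-processing (a sketch, not the asker's code) is to allocate a single dense accumulator and scatter-add only the stored entries of each matrix via its COO arrays:

```python
# Sketch: sum many sparse matrices into one dense array, touching only nonzeros.
import numpy as np
import scipy.sparse as sp

def accumulate_sparse(mats, shape):
    """Sum sparse matrices into a dense array, O(nnz) work per matrix."""
    acc = np.zeros(shape)
    for m in mats:
        c = m.tocoo()
        np.add.at(acc, (c.row, c.col), c.data)  # scatter-add; handles repeats
    return acc

mats = [sp.random(100, 100, density=0.05, format='csr', random_state=k)
        for k in range(10)]
total = accumulate_sparse(mats, (100, 100))
```

Per matrix this does O(N) scatter-adds instead of O(N²) dense additions, and only one N×N array is ever allocated.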

How can I find indices of each row of a matrix which has a duplicate in matlab?

拈花ヽ惹草 Submitted on 2019-12-04 04:06:57
I want to find the indices of all the rows of a matrix which have duplicates. For example:

```
A = [ 1 2 3 4
      1 2 3 4
      2 3 4 5
      1 2 3 4
      6 5 4 3 ]
```

The vector to be returned would be [1,2,4]. A lot of similar questions suggest using the unique function, which I've tried, but the closest I can get to what I want is:

```
[C, ia, ic] = unique(A, 'rows')
ia = [1 3 5]
m = 5;
setdiff(1:m, ia)  % = [2,4]
```

But using unique I can only extract the 2nd, 3rd, 4th, etc. instances of a row, and I need to also obtain the first. Is there any way I can do this? NB: It must be a method which doesn't involve looping through the rows, as I'm
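The missing piece is to use the inverse mapping together with per-group counts: a row is a duplicate exactly when its group has count > 1, which catches the first instance too. A loop-free sketch of the same idea in NumPy (MATLAB's unique(...,'rows') corresponds to np.unique with axis=0):

```python
# Sketch: indices of ALL rows that have a duplicate, first instances included.
import numpy as np

A = np.array([[1, 2, 3, 4],
              [1, 2, 3, 4],
              [2, 3, 4, 5],
              [1, 2, 3, 4],
              [6, 5, 4, 3]])

_, inv, counts = np.unique(A, axis=0, return_inverse=True, return_counts=True)
dup_idx = np.flatnonzero(counts[inv.ravel()] > 1)  # rows whose group size > 1
```

Here dup_idx is [0, 1, 3] (0-based), i.e. MATLAB's [1, 2, 4]. In MATLAB the same pattern works with the third output of unique plus accumarray or histc to get the group counts, with no explicit loop.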

Substitute for numpy broadcasting using scipy.sparse.csc_matrix

守給你的承諾、 Submitted on 2019-12-04 03:49:18
I have in my code the following expression:

```python
a = (b / x[:, np.newaxis]).sum(axis=1)
```

where b is an ndarray of shape (M, N), and x is an ndarray of shape (M,). Now, b is actually sparse, so for memory efficiency I would like to substitute in a scipy.sparse.csc_matrix or csr_matrix. However, broadcasting in this way is not implemented (even though division or multiplication is guaranteed to maintain sparsity, since the entries of x are non-zero), and raises a NotImplementedError. Is there a sparse function I'm not aware of that would do what I want? (dot() would sum along the wrong axis.) If b is
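One way to express the same row-wise division without broadcasting (a sketch with made-up data, not the asker's code) is to left-multiply by a sparse diagonal matrix holding 1/x, which stays sparse throughout:

```python
# Sketch: replace row-wise broadcast division with a sparse diagonal product.
import numpy as np
from scipy import sparse

rng = np.random.default_rng(0)
M, N = 5, 4
b_dense = rng.random((M, N))
b_dense[b_dense < 0.6] = 0.0                 # sparsify for the example
b = sparse.csc_matrix(b_dense)
x = rng.random(M) + 0.5                      # non-zero entries, as in the question

# diags(1/x) @ b scales row i of b by 1/x[i]; then take the row sums
a = np.asarray((sparse.diags(1.0 / x) @ b).sum(axis=1)).ravel()
```

This is algebraically identical to (b / x[:, np.newaxis]).sum(axis=1) and never densifies b; only the final length-M result is dense.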

Scipy's sparse eigsh() for small eigenvalues

不想你离开。 Submitted on 2019-12-04 03:40:37
I'm trying to write a spectral clustering algorithm using NumPy/SciPy for larger (but still tractable) systems, making use of SciPy's sparse linear algebra library. Unfortunately, I'm running into stability issues with eigsh(). Here's my code:

```python
import numpy as np
import scipy.sparse
import scipy.sparse.linalg as SLA
import sklearn.utils.graph as graph

W = self._sparse_rbf_kernel(self.X_, self.datashape)
D = scipy.sparse.csc_matrix(np.diag(np.array(W.sum(axis=0))[0]))
L = graph.graph_laplacian(W)  # D - W
vals, vects = SLA.eigsh(L, k=self.k, M=D, which='SM', sigma=0, maxiter=1000)
```
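For small eigenvalues, which='SM' is the usual source of instability: the Lanczos iteration converges poorly at the low end of the spectrum. The standard remedy is shift-invert, i.e. sigma=0 paired with which='LM' (largest magnitude of the inverted spectrum), which requires the shifted matrix to be nonsingular. A self-contained sketch on a deliberately nonsingular Laplacian-like matrix (example data is my own, not the asker's kernel):

```python
# Sketch: shift-invert eigsh for the smallest eigenvalues of a sparse SPD matrix.
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as SLA

n = 50
# Path-graph Laplacian shifted by the identity, so it is nonsingular at sigma=0
main = 2.0 * np.ones(n)
main[0] = main[-1] = 1.0
off = -np.ones(n - 1)
L = sp.diags([main + 1.0, off, off], [0, -1, 1], format='csc')

# Shift-invert targets the eigenvalues nearest sigma; note which='LM', not 'SM'
vals, vecs = SLA.eigsh(L, k=3, sigma=0, which='LM')
```

For a true graph Laplacian the smallest eigenvalue is exactly 0, so L itself is singular at sigma=0; a tiny negative shift (or a small diagonal regularization, as simulated here) is the usual workaround before factorizing.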