matrix-factorization

Very Large and Very Sparse Non-Negative Matrix Factorization

Submitted by て烟熏妆下的殇ゞ on 2021-02-07 07:19:04

Question: I have a very large and very sparse matrix (531K x 315K); the total number of cells is ~167 billion. The non-zero values are all 1s, and there are only about 45K of them. Is there an efficient NMF package that can handle this? I know of a couple of packages, but they work well only on small data matrices. Any idea helps. Thanks in advance.

Answer 1: scikit-learn will handle this easily! Code: from time import perf_counter as pc import numpy as np import scipy
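The answer's code is cut off above; the approach it refers to can be sketched as follows. This is a hedged illustration, not the answer's actual script: the matrix is shrunk to 200 x 100 for demonstration, and all variable names are my own. It assumes scikit-learn is installed.

```python
import numpy as np
import scipy.sparse as sp
from sklearn.decomposition import NMF

# Small sparse binary matrix built in COO format; the real problem is
# 531K x 315K with only ~45K ones, here shrunk for illustration.
rng = np.random.default_rng(0)
rows = rng.integers(0, 200, size=500)
cols = rng.integers(0, 100, size=500)
X = sp.coo_matrix((np.ones(500), (rows, cols)), shape=(200, 100)).tocsr()
X.data[:] = 1.0  # tocsr() sums duplicate entries; clamp back to binary

# NMF accepts scipy sparse input directly, so the ~167e9 zero cells
# are never materialized.
model = NMF(n_components=10, init="random", random_state=0, max_iter=200)
W = model.fit_transform(X)   # (200, 10) basis matrix
H = model.components_        # (10, 100) coefficient matrix
```

The key point is that the matrix stays in a sparse format end to end; only the number of stored non-zeros (45K here) drives memory use, not the 531K x 315K nominal shape.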

Avoiding dynamic memory allocation when factorizing a sparse matrix with Eigen

Submitted by 南楼画角 on 2021-01-29 05:11:17

Question: In my application I need to avoid dynamic memory allocation (malloc-like) everywhere except in class constructors. I have a sparse semidefinite matrix M whose elements change during program execution but whose sparsity pattern stays fixed. To solve many linear systems M * x = b as fast as possible, the idea is to perform an in-place decomposition in the class constructor, as described in Inplace matrix decompositions, and then call the factorize method whenever M changes: struct MyClass { private:

Cholesky decomposition failure for my correlation matrix

Submitted by 末鹿安然 on 2021-01-28 06:10:05

Question: I am trying to use chol() to find the Cholesky decomposition of the correlation matrix below. Is there a maximum size that function can handle? I am asking because I get the following: d <- chol(corrMat) Error in chol.default(corrMat) : the leading minor of order 61 is not positive definite but I can decompose submatrices of fewer than 60 elements without a problem (even ones that contain the 61st element of the original): > d <- chol(corrMat[10:69, 10:69]) > d <- chol(corrMat[10:70, 10:70]) Error in
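The error is not a size limit: it means the matrix stops being numerically positive definite once row/column 61 of the chosen submatrix is included, which is common for empirical correlation matrices with nearly collinear variables. A hedged Python illustration of the same failure mode (numpy standing in for R's chol, with a deliberately constructed matrix):

```python
import numpy as np

# Correlation-style matrix whose 3x3 determinant is negative, so it is
# not positive definite even though its leading 2x2 minor is.
C = np.array([[ 1.0,  0.9, -0.9],
              [ 0.9,  1.0,  0.9],
              [-0.9,  0.9,  1.0]])

np.linalg.cholesky(C[:2, :2])      # succeeds: the 2x2 submatrix is PD
try:
    np.linalg.cholesky(C)          # fails, like chol(corrMat) in R
except np.linalg.LinAlgError as e:
    print("not positive definite:", e)

print(np.linalg.eigvalsh(C))       # the smallest eigenvalue is negative
```

Checking `eigvalsh` (or `eigen` in R) pinpoints the problem: any negative eigenvalue, however small, makes a plain Cholesky factorization fail regardless of matrix size.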

Correct use of pivot in Cholesky decomposition of positive semi-definite matrix

Submitted by 主宰稳场 on 2020-01-02 03:42:13

Question: I don't understand how to use the chol function in R to factor a positive semi-definite matrix. (Or I do, and there's a bug.) The documentation states: If pivot = TRUE, then the Choleski decomposition of a positive semi-definite x can be computed. The rank of x is returned as attr(Q, "rank"), subject to numerical errors. The pivot is returned as attr(Q, "pivot"). It is no longer the case that t(Q) %*% Q equals x. However, setting pivot <- attr(Q, "pivot") and oo <- order(pivot), it is true
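To make the pivoting semantics concrete, here is a self-contained Python sketch of a pivoted Cholesky factorization (the function name and tolerance are my own, and this is an illustration of the algorithm, not R's implementation). It mirrors what the R documentation describes: the factor reconstructs the row/column-permuted matrix, and undoing the permutation via the pivot vector recovers x.

```python
import numpy as np

def pivoted_cholesky(a, tol=1e-12):
    """Lower-triangular pivoted Cholesky of a symmetric PSD matrix.

    Returns (L, piv, rank) such that a[np.ix_(piv, piv)] ~= L @ L.T,
    analogous to R's chol(x, pivot = TRUE) up to transposition.
    """
    a = a.astype(float).copy()
    n = a.shape[0]
    piv = np.arange(n)
    L = np.zeros_like(a)
    rank = n
    for k in range(n):
        j = k + int(np.argmax(np.diag(a)[k:]))   # largest remaining diagonal
        if a[j, j] <= tol:                       # PSD: remaining block ~ 0
            rank = k
            break
        # Symmetric swap of rows/columns k and j, tracked in piv
        a[[k, j], :] = a[[j, k], :]
        a[:, [k, j]] = a[:, [j, k]]
        L[[k, j], :k] = L[[j, k], :k]
        piv[[k, j]] = piv[[j, k]]
        L[k, k] = np.sqrt(a[k, k])
        L[k + 1:, k] = a[k + 1:, k] / L[k, k]
        a[k + 1:, k + 1:] -= np.outer(L[k + 1:, k], L[k + 1:, k])
    return L, piv, rank

# Rank-2 positive semi-definite 3x3 matrix: plain Cholesky would fail,
# the pivoted version succeeds and reports the rank.
B = np.array([[1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])
x = B @ B.T
L, piv, rank = pivoted_cholesky(x)
assert rank == 2
assert np.allclose(L @ L.T, x[np.ix_(piv, piv)])  # permuted reconstruction
```

This is exactly the doc's point: L @ L.T does not equal x directly, but it does equal x with rows and columns reordered by the pivot, so applying the inverse permutation (R's `oo <- order(pivot)`) recovers x.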

Python Non-negative Matrix Factorization that handles both zeros and missing data?

Submitted by 笑着哭i on 2019-12-29 17:51:13

Question: I am looking for an NMF implementation that has a Python interface and handles both missing data and zeros. I don't want to impute my missing values before starting the factorization; I want them to be ignored in the minimized function. It seems that neither scikit-learn, nor nimfa, nor graphlab, nor mahout offers such an option. Thanks!

Answer 1: Using this Matlab-to-Python code conversion sheet I was able to rewrite NMF from the Matlab toolbox library. I had to decompose a 40k x 1k matrix with sparsity
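The answer's ported Matlab routine is not shown above, but the underlying technique, weighted (masked) multiplicative updates in which missing entries get zero weight, can be sketched in plain numpy. All names here are my own; this is an illustration of the method, not the answer's code:

```python
import numpy as np

def nmf_masked(X, observed, k, n_iter=1000, eps=1e-9, seed=0):
    """Multiplicative-update NMF minimizing ||observed * (X - A @ B)||_F^2.

    `observed` is a 0/1 mask: missing entries get weight 0, so they are
    ignored by the objective rather than imputed beforehand. Genuine
    zeros in X keep weight 1 and are fit normally.
    """
    rng = np.random.default_rng(seed)
    m, n = X.shape
    A = rng.random((m, k))
    B = rng.random((k, n))
    for _ in range(n_iter):
        A *= ((observed * X) @ B.T) / ((observed * (A @ B)) @ B.T + eps)
        B *= (A.T @ (observed * X)) / (A.T @ (observed * (A @ B)) + eps)
    return A, B

# Demo on exact low-rank data with ~20% of entries marked missing
rng = np.random.default_rng(1)
U, V = rng.random((30, 2)), rng.random((2, 20))
X = U @ V
mask = (rng.random(X.shape) > 0.2).astype(float)
A, B = nmf_masked(X, mask, k=2)
err = np.linalg.norm(mask * (X - A @ B))  # error on observed entries only
```

The multiplicative updates preserve non-negativity, and because every term is multiplied by the mask, entries with weight 0 contribute nothing to either numerator or denominator — exactly the "ignored in the minimized function" behavior the question asks for.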

Cholesky decomposition of large sparse matrices in Java

Submitted by 笑着哭i on 2019-12-24 15:27:24

Question: I want to do Cholesky decomposition of large sparse matrices in Java. Currently I'm using the Parallel Colt library class SparseDoubleCholeskyDecomposition, but it's much slower than the code I wrote in C for dense matrices, which I call from Java via JNI. For example, for a 5570x5570 matrix with a non-zero density of 0.25%, SparseDoubleCholeskyDecomposition takes 26.6 seconds to factor, while my own dense-storage code for the same matrix takes only 1.12 seconds. However, if I set the
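The question itself is about Java, but the crossover the asker observed, where a tuned dense factorization beats a sparse one at moderate density, is easy to probe in any language. A hedged Python analogue (scipy's sparse LU solver stands in for a sparse Cholesky, which scipy does not ship; at the extreme sparsity of this tridiagonal example the sparse path wins, while fill-in at higher densities can flip the result):

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla
from time import perf_counter

n = 1000
# SPD tridiagonal matrix: ~0.3% dense, with a structure that factors
# without fill-in, so sparse solvers are at their best here.
A = sp.diags([-1.0, 2.0, -1.0], offsets=[-1, 0, 1], shape=(n, n), format="csc")
b = np.ones(n)

t0 = perf_counter()
x_sparse = spla.spsolve(A, b)            # sparse LU solve
t_sparse = perf_counter() - t0

t0 = perf_counter()
L = np.linalg.cholesky(A.toarray())      # dense Cholesky factor
x_dense = np.linalg.solve(L.T, np.linalg.solve(L, b))
t_dense = perf_counter() - t0

assert np.allclose(x_sparse, x_dense)
print(f"sparse: {t_sparse:.4f}s  dense: {t_dense:.4f}s")
```

Which path wins depends heavily on the sparsity pattern, not just the density: a pattern that causes heavy fill-in during factorization (as in the asker's 0.25%-dense random-looking matrix) can make the sparse factorization slower than an optimized dense one.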

MLlib MatrixFactorizationModel recommendProducts(user, num) failing on some users

Submitted by 耗尽温柔 on 2019-12-23 12:26:05

Question: I trained a MatrixFactorizationModel using ALS.train() and am now calling model.recommendProducts(user, num) to get the top recommended products, but the call fails for some users with the following error: user_products = model.call("recommendProducts", user, prodNum) File "/usr/lib/spark/python/pyspark/mllib/common.py", line 136, in call return callJavaFunc(self._sc, getattr(self._java_model, name), *a) File "/usr/lib/spark/python/pyspark/mllib/common.py", line 113, in callJavaFunc return

Why recommendProductsForUsers is not a member of org.apache.spark.mllib.recommendation.MatrixFactorizationModel

Submitted by 南楼画角 on 2019-12-23 06:29:55

Question: I have built a recommendation system using Spark MLlib's ALS collaborative filtering. My code snippet: bestModel.get .predict(toBePredictedBroadcasted.value) Everything is OK, but I need to change the code to fulfill a requirement. I read in the Scala docs here that I need to use def recommendProducts, but when I tried in my code: bestModel.get.recommendProductsForUsers(100) I get an error at compile time: value recommendProductsForUsers is not a member of org.apache.spark.mllib.recommendation