matrix-factorization

Very Large and Very Sparse Non-Negative Matrix Factorization

Submitted by て烟熏妆下的殇ゞ on 2021-02-07 07:19:04

Question: I have a very large and very sparse matrix (531K x 315K); the total number of cells is ~167 billion. The non-zero values are all 1s, and there are only about 45K of them. Is there an efficient NMF package that can handle this? I know of a couple of packages, but they work well only on small data matrices. Any idea helps. Thanks in advance.

Answer 1: scikit-learn will handle this easily! Code: from time import perf_counter as pc import numpy as np import scipy
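The answer's code is cut off above; the approach it refers to can be sketched as follows. This is a hedged illustration, not the answer's actual script: the matrix is shrunk to 200 x 100 for demonstration, and all variable names are my own. It assumes scikit-learn is installed.

```python
import numpy as np
import scipy.sparse as sp
from sklearn.decomposition import NMF

# Small sparse binary matrix built in COO format; the real problem is
# 531K x 315K with only ~45K ones, here shrunk for illustration.
rng = np.random.default_rng(0)
rows = rng.integers(0, 200, size=500)
cols = rng.integers(0, 100, size=500)
X = sp.coo_matrix((np.ones(500), (rows, cols)), shape=(200, 100)).tocsr()
X.data[:] = 1.0  # tocsr() sums duplicate entries; clamp back to binary

# NMF accepts scipy sparse input directly, so the ~167e9 zero cells
# are never materialized.
model = NMF(n_components=10, init="random", random_state=0, max_iter=200)
W = model.fit_transform(X)   # (200, 10) basis matrix
H = model.components_        # (10, 100) coefficient matrix
```

The key point is that the matrix stays in a sparse format end to end; only the number of stored non-zeros (45K here) drives memory use, not the 531K x 315K nominal shape.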

Avoiding dynamic memory allocation when factorizing a sparse matrix with Eigen

Submitted by 南楼画角 on 2021-01-29 05:11:17

Question: In my application I need to avoid dynamic memory allocation (malloc-like) everywhere except in class constructors. I have a sparse semidefinite matrix M whose elements change during program execution but whose sparsity pattern stays fixed. To solve many linear systems M * x = b as fast as possible, the idea is to perform an in-place decomposition in the class constructor, as described in Inplace matrix decompositions, and then call the factorize method whenever M changes: struct MyClass { private:

Cholesky decomposition failure for my correlation matrix

Submitted by 末鹿安然 on 2021-01-28 06:10:05

Question: I am trying to use chol() to find the Cholesky decomposition of the correlation matrix below. Is there a maximum size that function can handle? I am asking because I get the following: d <- chol(corrMat) Error in chol.default(corrMat) : the leading minor of order 61 is not positive definite but I can decompose submatrices of fewer than 60 elements without a problem (even ones that contain the 61st element of the original): > d <- chol(corrMat[10:69, 10:69]) > d <- chol(corrMat[10:70, 10:70]) Error in
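The error is not a size limit: it means the matrix stops being numerically positive definite once row/column 61 of the chosen submatrix is included, which is common for empirical correlation matrices with nearly collinear variables. A hedged Python illustration of the same failure mode (numpy standing in for R's chol, with a deliberately constructed matrix):

```python
import numpy as np

# Correlation-style matrix whose 3x3 determinant is negative, so it is
# not positive definite even though its leading 2x2 minor is.
C = np.array([[ 1.0,  0.9, -0.9],
              [ 0.9,  1.0,  0.9],
              [-0.9,  0.9,  1.0]])

np.linalg.cholesky(C[:2, :2])      # succeeds: the 2x2 submatrix is PD
try:
    np.linalg.cholesky(C)          # fails, like chol(corrMat) in R
except np.linalg.LinAlgError as e:
    print("not positive definite:", e)

print(np.linalg.eigvalsh(C))       # the smallest eigenvalue is negative
```

Checking `eigvalsh` (or `eigen` in R) pinpoints the problem: any negative eigenvalue, however small, makes a plain Cholesky factorization fail regardless of matrix size.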

Correct use of pivot in Cholesky decomposition of positive semi-definite matrix

Submitted by 主宰稳场 on 2020-01-02 03:42:13

Question: I don't understand how to use the chol function in R to factor a positive semi-definite matrix. (Or I do, and there's a bug.) The documentation states: If pivot = TRUE, then the Choleski decomposition of a positive semi-definite x can be computed. The rank of x is returned as attr(Q, "rank"), subject to numerical errors. The pivot is returned as attr(Q, "pivot"). It is no longer the case that t(Q) %*% Q equals x. However, setting pivot <- attr(Q, "pivot") and oo <- order(pivot), it is true
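To make the pivoting semantics concrete, here is a self-contained Python sketch of a pivoted Cholesky factorization (the function name and tolerance are my own, and this is an illustration of the algorithm, not R's implementation). It mirrors what the R documentation describes: the factor reconstructs the row/column-permuted matrix, and undoing the permutation via the pivot vector recovers x.

```python
import numpy as np

def pivoted_cholesky(a, tol=1e-12):
    """Lower-triangular pivoted Cholesky of a symmetric PSD matrix.

    Returns (L, piv, rank) such that a[np.ix_(piv, piv)] ~= L @ L.T,
    analogous to R's chol(x, pivot = TRUE) up to transposition.
    """
    a = a.astype(float).copy()
    n = a.shape[0]
    piv = np.arange(n)
    L = np.zeros_like(a)
    rank = n
    for k in range(n):
        j = k + int(np.argmax(np.diag(a)[k:]))   # largest remaining diagonal
        if a[j, j] <= tol:                       # PSD: remaining block ~ 0
            rank = k
            break
        # Symmetric swap of rows/columns k and j, tracked in piv
        a[[k, j], :] = a[[j, k], :]
        a[:, [k, j]] = a[:, [j, k]]
        L[[k, j], :k] = L[[j, k], :k]
        piv[[k, j]] = piv[[j, k]]
        L[k, k] = np.sqrt(a[k, k])
        L[k + 1:, k] = a[k + 1:, k] / L[k, k]
        a[k + 1:, k + 1:] -= np.outer(L[k + 1:, k], L[k + 1:, k])
    return L, piv, rank

# Rank-2 positive semi-definite 3x3 matrix: plain Cholesky would fail,
# the pivoted version succeeds and reports the rank.
B = np.array([[1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])
x = B @ B.T
L, piv, rank = pivoted_cholesky(x)
assert rank == 2
assert np.allclose(L @ L.T, x[np.ix_(piv, piv)])  # permuted reconstruction
```

This is exactly the doc's point: L @ L.T does not equal x directly, but it does equal x with rows and columns reordered by the pivot, so applying the inverse permutation (R's `oo <- order(pivot)`) recovers x.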

Python Non-negative Matrix Factorization that handles both zeros and missing data?

Submitted by 笑着哭i on 2019-12-29 17:51:13

Question: I am looking for an NMF implementation that has a Python interface and handles both missing data and zeros. I don't want to impute my missing values before starting the factorization; I want them to be ignored in the minimized function. It seems that neither scikit-learn, nor nimfa, nor graphlab, nor mahout offers such an option. Thanks!

Answer 1: Using this Matlab-to-Python code conversion sheet I was able to rewrite NMF from the Matlab toolbox library. I had to decompose a 40k x 1k matrix with sparsity
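The answer's ported Matlab routine is not shown above, but the underlying technique, weighted (masked) multiplicative updates in which missing entries get zero weight, can be sketched in plain numpy. All names here are my own; this is an illustration of the method, not the answer's code:

```python
import numpy as np

def nmf_masked(X, observed, k, n_iter=1000, eps=1e-9, seed=0):
    """Multiplicative-update NMF minimizing ||observed * (X - A @ B)||_F^2.

    `observed` is a 0/1 mask: missing entries get weight 0, so they are
    ignored by the objective rather than imputed beforehand. Genuine
    zeros in X keep weight 1 and are fit normally.
    """
    rng = np.random.default_rng(seed)
    m, n = X.shape
    A = rng.random((m, k))
    B = rng.random((k, n))
    for _ in range(n_iter):
        A *= ((observed * X) @ B.T) / ((observed * (A @ B)) @ B.T + eps)
        B *= (A.T @ (observed * X)) / (A.T @ (observed * (A @ B)) + eps)
    return A, B

# Demo on exact low-rank data with ~20% of entries marked missing
rng = np.random.default_rng(1)
U, V = rng.random((30, 2)), rng.random((2, 20))
X = U @ V
mask = (rng.random(X.shape) > 0.2).astype(float)
A, B = nmf_masked(X, mask, k=2)
err = np.linalg.norm(mask * (X - A @ B))  # error on observed entries only
```

The multiplicative updates preserve non-negativity, and because every term is multiplied by the mask, entries with weight 0 contribute nothing to either numerator or denominator — exactly the "ignored in the minimized function" behavior the question asks for.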

Cholesky decomposition of large sparse matrices in Java

Submitted by 笑着哭i on 2019-12-24 15:27:24

Question: I want to do Cholesky decomposition of large sparse matrices in Java. Currently I'm using the Parallel Colt library class SparseDoubleCholeskyDecomposition, but it's much slower than the code I wrote in C for dense matrices, which I call from Java via JNI. For example, for a 5570x5570 matrix with a non-zero density of 0.25%, SparseDoubleCholeskyDecomposition takes 26.6 seconds to factor, while my own dense-storage code for the same matrix takes only 1.12 seconds. However, if I set the
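The question itself is about Java, but the crossover the asker observed, where a tuned dense factorization beats a sparse one at moderate density, is easy to probe in any language. A hedged Python analogue (scipy's sparse LU solver stands in for a sparse Cholesky, which scipy does not ship; at the extreme sparsity of this tridiagonal example the sparse path wins, while fill-in at higher densities can flip the result):

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla
from time import perf_counter

n = 1000
# SPD tridiagonal matrix: ~0.3% dense, with a structure that factors
# without fill-in, so sparse solvers are at their best here.
A = sp.diags([-1.0, 2.0, -1.0], offsets=[-1, 0, 1], shape=(n, n), format="csc")
b = np.ones(n)

t0 = perf_counter()
x_sparse = spla.spsolve(A, b)            # sparse LU solve
t_sparse = perf_counter() - t0

t0 = perf_counter()
L = np.linalg.cholesky(A.toarray())      # dense Cholesky factor
x_dense = np.linalg.solve(L.T, np.linalg.solve(L, b))
t_dense = perf_counter() - t0

assert np.allclose(x_sparse, x_dense)
print(f"sparse: {t_sparse:.4f}s  dense: {t_dense:.4f}s")
```

Which path wins depends heavily on the sparsity pattern, not just the density: a pattern that causes heavy fill-in during factorization (as in the asker's 0.25%-dense random-looking matrix) can make the sparse factorization slower than an optimized dense one.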

MLlib MatrixFactorizationModel recommendProducts(user, num) failing on some users

Submitted by 耗尽温柔 on 2019-12-23 12:26:05

Question: I trained a MatrixFactorizationModel using ALS.train() and am now calling model.recommendProducts(user, num) to get the top recommended products, but the call fails for some users with the following error: user_products = model.call("recommendProducts", user, prodNum) File "/usr/lib/spark/python/pyspark/mllib/common.py", line 136, in call return callJavaFunc(self._sc, getattr(self._java_model, name), *a) File "/usr/lib/spark/python/pyspark/mllib/common.py", line 113, in callJavaFunc return

Why recommendProductsForUsers is not a member of org.apache.spark.mllib.recommendation.MatrixFactorizationModel

Submitted by 南楼画角 on 2019-12-23 06:29:55

Question: I have built a recommendation system using Spark MLlib's ALS collaborative filtering. My code snippet: bestModel.get .predict(toBePredictedBroadcasted.value) Everything is OK, but I need to change the code to fulfill a requirement. I read in the Scala docs here that I need to use def recommendProducts, but when I tried in my code: bestModel.get.recommendProductsForUsers(100) I get an error at compile time: value recommendProductsForUsers is not a member of org.apache.spark.mllib.recommendation