collaborative-filtering

How can I handle new users/items in model generated by Spark ALS from MLlib?

Question: Currently, when a new user comes, I cannot update my recommender system, which apparently is related to not having added the user and item matrix. Where can I find this and how do I do this? Thanks. model.userFactors model.itemFactors Answer 1: When item features and user features are computed, the model is prepared only to recommend for known items and users. If you have a new user/item, you have to cope with the cold start problem. But there are two things - making recommendations work for new users/items

Model creation for user-user collaborative filtering

Question: I want to do a sort of user-user collaborative filtering in which the users in the user-item matrix are a selected subset of all users in the database. These selected users are refreshed regularly with the preferences of newly selected users. New users shouldn't be added to the matrix. For a new user, we need to recommend items from the user-item matrix (which contains only the selected users) based on his preferences. I do not want to add the new anonymous users to the matrix. Explored in

Generating test set for recommendation engine

Question: I am working on a recommendation engine based on implicit feedback. I was using this link: http://insightdatascience.com/blog/explicit_matrix_factorization.html#movielens It uses ALS (alternating least squares) to compute the user and item vectors. Since my data set cannot be partitioned by time, I am randomly taking 'x' ratings from each user and putting them into the test set. This is a reproducible example of my training user-item matrix. col1 col2 col3 col4 col5 col6 col7 col8
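The random per-user holdout described in the question can be sketched roughly as follows. This is a minimal illustration, not the asker's code: the function name `split_per_user`, the dict-of-dicts layout, and the choice to always leave at least one rating in training are all assumptions.

```python
import random


def split_per_user(ratings, x, seed=0):
    """Move up to x randomly chosen ratings per user into a test set.

    ratings maps user -> {item: rating} (a hypothetical layout).
    At least one rating per user stays in training, which is an
    assumption made here so that every user keeps some signal.
    """
    rng = random.Random(seed)  # seeded so the split is repeatable
    train, test = {}, {}
    for user, items in ratings.items():
        rated = sorted(items)  # fixed order before sampling
        k = min(x, max(len(rated) - 1, 0))
        held_out = set(rng.sample(rated, k))
        train[user] = {i: r for i, r in items.items() if i not in held_out}
        test[user] = {i: r for i, r in items.items() if i in held_out}
    return train, test


ratings = {
    "u1": {"a": 1, "b": 1, "c": 1, "d": 1},
    "u2": {"a": 1},
}
train, test = split_per_user(ratings, x=2)
```

Users with very few ratings contribute fewer than 'x' test items under this scheme, so per-user metrics may need to skip users whose test set is empty.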

Apache Spark ALS collaborative filtering results. They don't make sense

Question: I wanted to try out Spark for collaborative filtering using MLlib, as explained in this tutorial: https://databricks-training.s3.amazonaws.com/movie-recommendation-with-mllib.html The algorithm is based on the paper "Collaborative Filtering for Implicit Feedback Datasets" and does matrix factorization. Everything is up and running using the 10 million MovieLens data set. The data set is split into 80% training, 10% test, and 10% validation. RMSE Baseline: 1.060505464225402 RMSE (train) = 0

Similarity function for Mahout boolean user-based recommender

Question: I am using Mahout to build a user-based recommendation system which operates on boolean data. I use GenericBooleanPrefUserBasedRecommender and NearestNUserNeighborhood, and am now trying to decide on the most suitable user similarity function. It was suggested to use either LogLikelihoodSimilarity or TanimotoCoefficientSimilarity. I tried both and am getting [subjectively evaluated] meaningful results in both cases. However, the RMSE for the same data set is better with LogLikelihoodSimilarity. The number of "no recommendation" results is similar in both cases. Can anyone recommend which of these similarity
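For reference, the Tanimoto coefficient between two users with boolean preferences is the size of the intersection of their item sets divided by the size of the union. The sketch below is plain Python written for illustration, not Mahout's implementation; the function name `tanimoto` and the sample item sets are assumptions.

```python
def tanimoto(items_a, items_b):
    """Tanimoto (Jaccard) coefficient of two item collections.

    Returns |A & B| / |A | B|, or 0.0 when both collections are empty
    (the empty-case value is a convention chosen for this sketch).
    """
    a, b = set(items_a), set(items_b)
    union = len(a | b)
    if union == 0:
        return 0.0
    return len(a & b) / union


# Two hypothetical users who share 2 of 4 distinct items.
similarity = tanimoto({"x", "y", "z"}, {"y", "z", "w"})
print(similarity)  # 0.5
```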

What is algorithm behind the recommendation sites like last.fm, grooveshark, pandora?

Question: I am thinking of starting a project based on a recommendation system. I need to improve in this area, which looks like a hot topic on the web. I am also wondering which algorithms last.fm, Grooveshark, and Pandora use for their recommendation systems. If you know any book, site, or other resource on this kind of algorithm, please let me know. Answer 1: Have a look at collaborative filtering or recommender systems. One simple algorithm is Slope One. Answer 2: A fashionably late response: Pandora
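To make the Slope One suggestion concrete, here is a small sketch of the weighted variant: for each item the user has rated, take the average rating difference between the target item and that item over all users who rated both, add it to the user's rating, and weight the result by how many users contributed. The function name `slope_one_predict` and the three-user data set are illustrative assumptions, not anything from the answers.

```python
def slope_one_predict(ratings, user, target):
    """Predict user's rating for target with weighted Slope One.

    ratings maps user -> {item: rating}. Returns None when no item
    rated by the user has been co-rated with target by anyone.
    """
    weighted_sum = 0.0
    total_count = 0
    for other, r_other in ratings[user].items():
        if other == target:
            continue
        # Average deviation (target - other) over users who rated both.
        diff_sum = 0.0
        count = 0
        for r in ratings.values():
            if target in r and other in r:
                diff_sum += r[target] - r[other]
                count += 1
        if count:
            weighted_sum += (diff_sum / count + r_other) * count
            total_count += count
    return weighted_sum / total_count if total_count else None


ratings = {
    "john": {"A": 5, "B": 3, "C": 2},
    "mark": {"A": 3, "B": 4},
    "lucy": {"B": 2, "C": 5},
}
prediction = slope_one_predict(ratings, "lucy", "A")
```

Here item A is on average 0.5 above B (two users) and 3 above C (one user), so the prediction for lucy is ((0.5 + 2) * 2 + (3 + 5) * 1) / 3, about 4.33.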

Spark mllib : how to convert string categorical features into int for Rating to accept

Question: I want to build a recommendation application using Spark MLlib and the ALS algorithm for collaborative filtering. My data set has the user and product features in string form, like: [{"user":"StringName1", "product":"StringProduct1", "rating":1}, {"user":"StringName2", "product":"StringProduct2", "rating":2}, {"user":"StringName1", "product":"StringProduct2", "rating":3},..] But the Rating method seems to accept only int values for both user and product features. Does that mean I
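One common way to get integer ids is to assign each distinct string a sequential index and keep the mapping so results can be translated back. The sketch below shows the idea in plain Python on the three sample records from the question; `build_index` is a hypothetical helper, and a real Spark job would build the mapping in a distributed way rather than in driver memory.

```python
def build_index(values):
    """Map each distinct value to a sequential int id, in first-seen order."""
    index = {}
    for value in values:
        if value not in index:
            index[value] = len(index)
    return index


records = [
    {"user": "StringName1", "product": "StringProduct1", "rating": 1},
    {"user": "StringName2", "product": "StringProduct2", "rating": 2},
    {"user": "StringName1", "product": "StringProduct2", "rating": 3},
]

user_index = build_index(r["user"] for r in records)
product_index = build_index(r["product"] for r in records)

# (user_id, product_id, rating) triples with integer ids.
triples = [
    (user_index[r["user"]], product_index[r["product"]], float(r["rating"]))
    for r in records
]

# Reverse lookup to turn recommended product ids back into names.
product_names = {i: name for name, i in product_index.items()}
```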

Recommendation System for a book store application

Question: Hey, I'm trying to learn some of the recommendation algorithms that are used on websites like Amazon.com. I have a simple Java (Spring, Hibernate, Postgres) book store application in which a Book has the attributes title, category, tags, and author. For simplicity, there's no content inside the book. A book is identified by its title, category, author, and tags. For each user logging into the application, I should be able to recommend some books. Each user can view a book, add it to the cart, and buy it anytime. So in the database I'm storing how many times each user looked at a book, the