Spark MLlib ALS trainImplicit: rating values greater than 1 [duplicate]

Posted by 蓝咒 on 2019-12-12 03:36:23

Question


I have been experimenting with Spark MLlib's ALS (trainImplicit) for a while now and would like to understand:

1. Why am I getting rating values greater than 1 in the predictions?

2. Is there any need to normalize the user-product input?

Sample result:

[Rating(user=316017, product=114019, rating=3.1923),
 Rating(user=316017, product=41930, rating=2.0146997092620897)]

The documentation says that the predicted rating values will be somewhere around the 0 to 1 range. I know the rating values can still be used for recommendations, but it would be great to know the reason.


Answer 1:


The cost function in ALS trainImplicit() does not impose any constraint on the predicted rating values: it only penalizes the magnitude of the difference between each prediction and a binary preference of 0 or 1. Predictions can therefore exceed 1, and you may also find some negative values. That is why the documentation says the predicted values are around [0, 1], not necessarily inside it.
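To make this concrete, here is a minimal NumPy sketch of confidence-weighted implicit ALS on a tiny dense matrix. It is not Spark's implementation: the function name implicit_als, the toy matrix R, and the values of rank, alpha, and reg are all assumptions chosen for illustration, and Spark scales its regularization differently. The point is only that each update is an unconstrained least-squares solve toward 0/1 targets, so nothing clips the resulting scores to [0, 1].

```python
import numpy as np

def implicit_als(R, rank=2, iterations=10, alpha=1.0, reg=0.1, seed=0):
    """Toy dense implicit-feedback ALS (hypothetical helper, not Spark's code)."""
    rng = np.random.default_rng(seed)
    n_users, n_items = R.shape
    P = (R > 0).astype(float)   # binary preference: 1 if any interaction, else 0
    C = 1.0 + alpha * R         # confidence grows with the raw interaction value
    X = rng.normal(scale=0.1, size=(n_users, rank))  # user factors
    Y = rng.normal(scale=0.1, size=(n_items, rank))  # item factors
    ridge = reg * np.eye(rank)
    for _ in range(iterations):
        # Fix item factors, solve a weighted ridge regression per user.
        for u in range(n_users):
            Cu = np.diag(C[u])
            X[u] = np.linalg.solve(Y.T @ Cu @ Y + ridge, Y.T @ Cu @ P[u])
        # Fix user factors, solve a weighted ridge regression per item.
        for i in range(n_items):
            Ci = np.diag(C[:, i])
            Y[i] = np.linalg.solve(X.T @ Ci @ X + ridge, X.T @ Ci @ P[:, i])
    return X, Y

# Made-up interaction counts: rows are users, columns are products.
R = np.array([[5, 0, 3, 0],
              [4, 0, 0, 1],
              [0, 2, 0, 5],
              [0, 3, 4, 0]], dtype=float)

X, Y = implicit_als(R)
scores = X @ Y.T  # predicted preference for every (user, product) pair
print(scores.round(3))
print("min:", scores.min(), "max:", scores.max())
```

The scores are plain dot products of the learned factor vectors, so their range depends on the data and hyperparameters rather than on any bound built into the objective.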

There is an option to restrict the model to non-negative factorization, so that you never get a negative value in the predicted ratings or in the feature matrices, but in our case that seemed to reduce performance.
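In PySpark's RDD-based API this option is, as far as I know, the nonnegative keyword of ALS.trainImplicit. The sketch below wraps that call and adds a small helper for checking predictions. The wrapper name train_implicit_nonnegative, the helper negative_predictions, the FakeRating stand-in, and the default hyperparameters are all hypothetical. Note that non-negative factors only rule out negative scores; they do not cap scores at 1.

```python
from collections import namedtuple

def train_implicit_nonnegative(ratings, rank=10, iterations=10, alpha=0.01):
    """Train implicit ALS with non-negative factors (hypothetical wrapper).

    `ratings` is assumed to be an RDD of pyspark.mllib.recommendation.Rating
    objects or (user, product, value) tuples.
    """
    # Imported lazily so this file can be loaded without a Spark installation.
    from pyspark.mllib.recommendation import ALS
    return ALS.trainImplicit(ratings, rank, iterations=iterations,
                             alpha=alpha, nonnegative=True)

def negative_predictions(predictions):
    """Return the predictions whose rating is below zero."""
    return [p for p in predictions if p.rating < 0]

# Stand-in for collected Rating objects; the values are made up.
FakeRating = namedtuple("FakeRating", ["user", "product", "rating"])
sample = [FakeRating(1, 10, 3.19), FakeRating(1, 11, -0.2), FakeRating(1, 12, 0.5)]
print(negative_predictions(sample))
```

Running negative_predictions on the collected output of a model trained with and without the option is one way to confirm that the constraint is actually in effect.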



Source: https://stackoverflow.com/questions/36275600/mllib-spark-alstrainimplicit-value-more-than-1
