Flooding Bayesian rating creates values out of range

扶醉桌前 提交于 2019-12-21 02:59:07

问题


I'm trying to apply the Bayesian rating formula, but if I rate 1 out of 5 thousand of hundreds, the final rating is greater than 5.

For example, a given item has no votes and after voting 170,000 times with 1 star, its final rating is 5.23. If I rate 100, it has a normal value.

Here is what I have in PHP.

<?php
// these values came from DB
$total_votes     = 2936;    // total of votes for all items
$total_rating    = 582.955; // sum of all ratings
$total_items     = 202;

// now the specific item, it has no votes yet
$this_num_votes  = 0;
$this_score      = 0;
$this_rating     = 0;

// simulating a lot of votes with 1 star
for ($i=0; $i < 170000; $i++) { 
    $rating_sent = 1; // the new rating, always 1

    $total_votes++; // adding 1 to total
    $total_rating = $total_rating+$rating_sent; // adding 1 to total

    $avg_num_votes = ($total_votes/$total_items); // Average number of votes in all items
    $avg_rating = ($total_rating/$total_items);   // Average rating for all items
    $this_num_votes = $this_num_votes+1;          // Number of votes for this item
    $this_score = $this_score+$rating_sent;       // Sum of all votes for this item
    $this_rating = $this_score/$this_num_votes;   // Rating for this item

    $bayesian_rating = ( ($avg_num_votes * $avg_rating) + ($this_num_votes * $this_rating) ) / ($avg_num_votes + $this_num_votes);
}
echo $bayesian_rating;
?>

Even if I flood with 1 or 2:

$rating_sent = rand(1,2)

The final rating after 100,000 votes is over 5.

I just did a new test using

$rating_sent = rand(1,5)

And after 100,000 I got a value completely out of range range (10.53). I know that in a normal situation no item will get 170,000 votes while all the other items get no vote. But I wonder if there is something wrong with my code or if this is an expected behavior of Bayesian formula considering the massive votes.

Edit

Just to make it clear, here is a better explanation for some variables.

$avg_num_votes   // SUM(votes given to all items)/COUNT(all items)
$avg_rating      // SUM(rating of all items)/COUNT(all items)
$this_num_votes  // COUNT(votes given for this item)
$this_score      // SUM(rating for this item)
$bayesian_rating // is the formula itself

The formula is: ( (avg_num_votes * avg_rating) + (this_num_votes * this_rating) ) / (avg_num_votes + this_num_votes). Taken from here


回答1:


You need to divide by total_votes rather than total_items when calculating avg_rating.

I made the changes and got something that behaves much better here.

http://codepad.org/gSdrUhZ2



来源:https://stackoverflow.com/questions/6011241/flooding-bayesian-rating-creates-values-out-of-range

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!