Efficiently determining the probability of a user clicking a hyperlink

佛祖请我去吃肉 2020-12-20 10:46

So I have a bunch of hyperlinks on a web page. From past observation I know the probabilities that a user will click on each of these hyperlinks. I can therefore calculate

4 Answers
  •  一生所求
    2020-12-20 11:17

    I made this a new answer since it's fundamentally different.

    This is based on Chris Bishop, Pattern Recognition and Machine Learning, Chapter 2 "Probability Distributions", p. 71 ff., and http://en.wikipedia.org/wiki/Beta_distribution.

    First we fit a beta distribution to the given mean and variance in order to build a prior distribution over the click probability. We then update it with the observed clicks and views and return the mode of the resulting posterior, which is the most probable value of the parameter of a Bernoulli variable.

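    def estimate(prior_mean, prior_variance, clicks, views):
        # Moment-match a Beta(a, b) prior to the given mean and variance.
        c = (prior_mean * (1 - prior_mean)) / prior_variance - 1
        a = prior_mean * c
        b = (1 - prior_mean) * c
        # Mode of the posterior Beta(a + clicks, b + views - clicks).
        return (a + clicks - 1) / (a + b + views - 2)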
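
    As a sanity check on the fitting step, the sketch below recomputes the mean and variance of the fitted beta distribution and confirms they match the inputs. The helper names fit_beta and beta_mean_variance and the example numbers are made up for illustration; the input check is an added assumption that the variance must be smaller than mean * (1 - mean), since otherwise no beta distribution with those moments exists.

```python
def fit_beta(prior_mean, prior_variance):
    """Moment-match Beta(a, b) to a mean and variance (hypothetical helper)."""
    limit = prior_mean * (1 - prior_mean)
    if not 0 < prior_mean < 1 or not 0 < prior_variance < limit:
        raise ValueError("need 0 < mean < 1 and 0 < variance < mean * (1 - mean)")
    c = limit / prior_variance - 1  # c equals a + b
    return prior_mean * c, (1 - prior_mean) * c


def beta_mean_variance(a, b):
    """Mean and variance of Beta(a, b), using the standard closed forms."""
    mean = a / (a + b)
    variance = a * b / ((a + b) ** 2 * (a + b + 1))
    return mean, variance


# Illustrative numbers only: a 10% click rate with variance 0.001.
a, b = fit_beta(0.1, 0.001)                # a ≈ 8.9, b ≈ 80.1
mean, variance = beta_mean_variance(a, b)  # recovers 0.1 and 0.001
```

    Here a + b is about 89, which can be read as the number of pseudo-observations the prior is worth.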
    

    However, I am fairly sure that a prior given only as a mean and variance will not work well for you, since it throws away how many samples the prior is based on and therefore how reliable it is.

    Instead: given a set of (webpage, link_clicked) pairs, you can count the number of pages on which a specific link was clicked. Let that be m. Let the number of times that link was not clicked be l.

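    Now let a be the number of clicks on your new link and b the number of visits on which it was not clicked. The estimated click probability of your new link is then

    def estimate(m, l, a, b):
        # Pooled clicks over pooled clicks plus non-clicks.
        # float() forces true division under Python 2 as well.
        return float(m + a) / (m + l + a + b)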
    

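    This looks pretty trivial, but it has a valid probabilistic foundation: it is the posterior mean of the click probability under a beta prior with m and l as pseudo-counts. From an implementation perspective, you can keep m and l as global counters.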
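
    One possible way to organise this is sketched below, assuming Python 3 division and assuming m and l are pooled counts held in one place while a and b are tracked per link, with b counting views that did not lead to a click. The class ClickEstimator, its method names and the example counts are hypothetical.

```python
class ClickEstimator:
    """Global prior counts (m, l) plus per-link counts (a, b)."""

    def __init__(self, m, l):
        if m + l <= 0:
            raise ValueError("need at least one prior observation")
        self.m = m        # prior clicks
        self.l = l        # prior non-clicks
        self.counts = {}  # link id -> [clicks, non_clicks]

    def record(self, link, clicked):
        """Register one view of `link` and whether it was clicked."""
        entry = self.counts.setdefault(link, [0, 0])
        entry[0 if clicked else 1] += 1

    def estimate(self, link):
        """(m + a) / (m + l + a + b) for the given link."""
        a, b = self.counts.get(link, (0, 0))
        return (self.m + a) / (self.m + self.l + a + b)


# Illustrative numbers only: the prior says roughly 10% of views are clicks.
estimator = ClickEstimator(m=10, l=90)
before = estimator.estimate("new-link")  # no data yet: 10 / 100
for clicked in [True] * 3 + [False] * 7:
    estimator.record("new-link", clicked)
after = estimator.estimate("new-link")   # (10 + 3) / (10 + 90 + 3 + 7)
```

    A link with no data falls back to the prior rate m / (m + l), and the estimate moves toward the link's own click rate as its views accumulate.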
