Hot content algorithm / score with time decay

守給你的承諾、 提交于 2019-11-28 15:15:15

Reddits old formula and a little drop off

Basically you can use Reddit's formula. Since your system only supports upvotes you could weight them, resulting in something like this:

def hotness(track)
    s = track.playedCount
    s = s + 2*track.downloadCount
    s = s + 3*track.likeCount
    s = s + 4*track.favCount
    baseScore = log(max(s,1))

    timeDiff = (now - track.uploaded).toWeeks

    if(timeDiff > 1)
        x = timeDiff - 1
        baseScore = baseScore * exp(-8*x*x)

    return baseScore

The factor exp(-8*x*x) will give you your desired drop off:

The basics behind

You can use any function that goes to zero faster than your score goes up. Since we use log on our score, even a linear function can get multiplied (as long as your score doesn't grow exponentially).

So all you need is a function that returns 1 as long as you don't want to modify the score, and drops afterwards. Our example above forms that function:

multiplier(x) = x > 1 ? exp(-8*x*x) : 1

You can vary the multiplier if you want less steep curves.

Example in C++

Lets say that the probability for a given track to be played in a given hour is 50%, download 10%, like 1% and favorite 0.1%. Then the following C++ program will give you an estimate for your scores behavior:

#include <iostream>
#include <fstream>
#include <random>
#include <ctime>
#include <cmath>

struct track{
    track() : uploadTime(0),playCount(0),downCount(0),likeCount(0),faveCount(0){}
    std::time_t uploadTime;    
    unsigned int playCount;
    unsigned int downCount;
    unsigned int likeCount;
    unsigned int faveCount;    
    void addPlay(unsigned int n = 1){ playCount += n;}
    void addDown(unsigned int n = 1){ downCount += n;}
    void addLike(unsigned int n = 1){ likeCount += n;}
    void addFave(unsigned int n = 1){ faveCount += n;}
    unsigned int baseScore(){
        return  playCount +
            2 * downCount +
            3 * likeCount +
            4 * faveCount;
    }
};

int main(){
    track test;
    const unsigned int dayLength = 24 * 3600;
    const unsigned int weekLength = dayLength * 7;    

    std::mt19937 gen(std::time(0));
    std::bernoulli_distribution playProb(0.5);
    std::bernoulli_distribution downProb(0.1);
    std::bernoulli_distribution likeProb(0.01);
    std::bernoulli_distribution faveProb(0.001);

    std::ofstream fakeRecord("fakeRecord.dat");
    std::ofstream fakeRecordDecay("fakeRecordDecay.dat");
    for(unsigned int i = 0; i < weekLength * 3; i += 3600){
        test.addPlay(playProb(gen));
        test.addDown(downProb(gen));
        test.addLike(likeProb(gen));
        test.addFave(faveProb(gen));    

        double baseScore = std::log(std::max<unsigned int>(1,test.baseScore()));
        double timePoint = static_cast<double>(i)/weekLength;        

        fakeRecord << timePoint << " " << baseScore << std::endl;
        if(timePoint > 1){
            double x = timePoint - 1;
            fakeRecordDecay << timePoint << " " << (baseScore * std::exp(-8*x*x)) << std::endl;
        }
        else
            fakeRecordDecay << timePoint << " " << baseScore << std::endl;
    }
    return 0;
}

Result:

This should be sufficient for you.

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!