What is the best way to compute trending topics or tags?

太阳男子 2020-12-04 04:34

Many sites offer some statistics like "The hottest topics in the last 24h". For example, Topix.com shows this in its section "News Trends". There, you can see the topics

11 answers
  •  北荒 (original poster)
    2020-12-04 04:48

    probably a simple gradient of topic frequency would work -- large positive gradient = growing quickly in popularity.

    the easiest way would be to bin the number of searches per day, so you have something like

    searches = [ 10, 7, 14, 8, 9, 12, 55, 104, 100 ]
    

    and then find out how much it changed from day to day:

    hot_factor = [ b-a for a, b in zip(searches[:-1], searches[1:]) ]
    # hot_factor is [ -3, 7, -6, 1, 3, 43, 49, -4 ]
    

    and then apply some sort of threshold, so that days where the increase was > 50 are considered 'hot'. you could make this far more complicated if you'd like, too. rather than the absolute difference you can take the relative difference, so that going from 100 to 150 is considered hot but going from 1000 to 1050 isn't. or you could use a more complicated gradient that takes into account the trend over several days rather than just one day to the next.
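A minimal sketch of the two refinements mentioned above, using the same `searches` list. The function names, the 0.4 relative threshold and the 3-day window are illustrative assumptions, not values from the answer; tune them to your own data.

```python
searches = [10, 7, 14, 8, 9, 12, 55, 104, 100]

def relative_changes(counts):
    """Day-over-day change as a fraction of the previous day's count."""
    changes = []
    for prev, cur in zip(counts[:-1], counts[1:]):
        if prev == 0:
            # Assumption: growth from zero is treated as infinitely large.
            changes.append(float("inf") if cur > 0 else 0.0)
        else:
            changes.append((cur - prev) / prev)
    return changes

def hot_days(counts, threshold=0.4):
    """Indices of days whose count grew by more than `threshold` (as a fraction)."""
    return [i + 1 for i, r in enumerate(relative_changes(counts)) if r > threshold]

def window_growth(counts, window=3):
    """Ratio of each day's count to the mean of the preceding `window` days."""
    ratios = []
    for i in range(window, len(counts)):
        baseline = sum(counts[i - window:i]) / window
        ratios.append(counts[i] / baseline if baseline else float("inf"))
    return ratios

print(hot_days(searches))        # days flagged by relative change
print(window_growth(searches))   # growth against a 3-day baseline
```

With a relative threshold, 100 to 150 is flagged and 1000 to 1050 is not. Small counts can still produce large relative jumps (7 to 14 doubles), which is one reason to compare against a multi-day baseline instead of only the previous day.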
