What\'s the rationale behind the formula used in the hive_trend_mapper.py program of this Hadoop tutorial on calculating Wikipedia trends?
There are actuall
another way to look at it is this:
suppose your page and my page are made at same day, and ur page gets total views about ten million, and mine about 1 million till some point. then suppose the slope at some point is a million for me, and 0.5 million for you. if u just use slope, then i win, but ur page already had more views per day at that point, urs were having 5 million, and mine 1 million, so that a million on mine still makes it 2 million, and urs is 5.5 million for that day. so may be this scaling concept is to try to adjust the results to show that ur page is also good as a trend setter, and its slope is less but it already was more popular, but the scaling is only a log factor, so doesnt seem too problematic to me.