How to calculate cumulative moving average in Python/SQLAlchemy/Flask

核能气质少年 submitted on 2019-12-14 00:21:32

Question


I'll give some context so it makes sense. I'm capturing Customer Ratings for Products in a table (Rating) and want to be able to return a Cumulative Moving Average of the ratings based on time.

A basic example follows taking a rating per day:

02 FEB - Rating: 5 - Cum Avg: 5
03 FEB - Rating: 4 - Cum Avg: (5+4)/2 = 4.5
04 FEB - Rating: 1 - Cum Avg: (5+4+1)/3 = 3.33
05 FEB - Rating: 5 - Cum Avg: (5+4+1+5)/4 = 3.75
Etc...
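
The running figures above can be reproduced with a short Python sketch. The function name `cumulative_averages` is made up for illustration and is not part of the original question; it assumes the ratings are already sorted by date.

```python
from itertools import accumulate

def cumulative_averages(ratings):
    """Return the cumulative average after each rating, in order."""
    running_totals = accumulate(ratings)  # 5, 9, 10, 15 for the example data
    return [total / count for count, total in enumerate(running_totals, start=1)]

# Ratings from 02 FEB to 05 FEB in the example above
print([round(avg, 2) for avg in cumulative_averages([5, 4, 1, 5])])
# prints [5.0, 4.5, 3.33, 3.75]
```

This recomputes everything from the full history, so it is only a check on the arithmetic, not a scalable storage strategy.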

I'm trying to think of an approach that won't scale horribly.

My current idea is to have a function that is triggered when a row is inserted into the Rating table and that works out the Cum Avg from the previous row for that product.

So the fields would be something like:

TABLE: Rating
| RatingId | DateTime | ProdId | RatingVal | RatingCnt | CumAvg |

But this seems like a fairly dodgy way to store the data.

What would be the (or any) way to accomplish this? If I were to use a 'trigger' of sorts, how would I go about doing that in SQLAlchemy?

Any and all advice appreciated!


Answer 1:


I don't know about SQLAlchemy, but I might use an approach like this:

  • Store the cumulative average and rating count separately from individual ratings.
  • Every time you get a new rating, update the cumulative average and rating count:
    • new_count = old_count + 1
    • new_average = ((old_average * old_count) + new_rating) / new_count
  • Optionally, store a row for each new rating.

Updating the average and rating count could be done with a single SQL statement.
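
As a rough sketch of that single-statement update, here is one way it might look using Python's built-in `sqlite3` module rather than SQLAlchemy. The summary table `ProductRating` and the helper `add_rating` are hypothetical names chosen for this example; the column names are borrowed from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical summary table: one row per product, separate from individual ratings.
conn.execute(
    "CREATE TABLE ProductRating ("
    "ProdId INTEGER PRIMARY KEY, RatingCnt INTEGER NOT NULL, CumAvg REAL NOT NULL)"
)

def add_rating(conn, prod_id, rating):
    """Fold one new rating into the stored count and cumulative average."""
    # Make sure the product has a row to update (no-op if it already exists).
    conn.execute(
        "INSERT OR IGNORE INTO ProductRating (ProdId, RatingCnt, CumAvg) "
        "VALUES (?, 0, 0.0)",
        (prod_id,),
    )
    # Single UPDATE: new_average = (old_average * old_count + new_rating) / new_count.
    # CumAvg is assigned first, so it is computed from the old RatingCnt.
    conn.execute(
        "UPDATE ProductRating "
        "SET CumAvg = (CumAvg * RatingCnt + ?) / (RatingCnt + 1), "
        "    RatingCnt = RatingCnt + 1 "
        "WHERE ProdId = ?",
        (rating, prod_id),
    )
    conn.commit()

# Feed in the ratings from the question's example for product 1.
for r in [5, 4, 1, 5]:
    add_rating(conn, 1, r)

print(conn.execute(
    "SELECT RatingCnt, CumAvg FROM ProductRating WHERE ProdId = 1"
).fetchone())
```

Because the update only touches one summary row per product, its cost does not grow with the number of stored ratings. The same SQL could presumably be issued through SQLAlchemy, but that is not shown here.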




Answer 2:


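I think you could keep the running average and the rating count in a two-element list, which would be much simpler:

# a = [cumulative average, rating count]; the first rating (5) counts as one rating
a = [5, 1]

# later ratings from the question's example
ratings = [4, 1, 5]

# fold each new rating into the average and bump the count (Python 3 division)
for new_rating in ratings:
    a = [(a[0] * a[1] + new_rating) / (a[1] + 1), a[1] + 1]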

Bye



Source: https://stackoverflow.com/questions/7157768/how-to-calculate-cumulative-moving-average-in-python-sqlalchemy-flask
