Question
I'm looking to get a continuous rolling mean of a column in a DataFrame.
The df looks like this:
index price
0 4
1 6
2 10
3 12
I want a continuous rolling mean of price.
The goal is for it to look like this, with a running mean of all the prices up to each row:
index price mean
0 4 4
1 6 5
2 10 6.67
3 12 8
Thank you in advance!
Answer 1:
You can use expanding():
df['mean'] = df.price.expanding().mean()
df
index price mean
0 4 4.000000
1 6 5.000000
2 10 6.666667
3 12 8.000000
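For anyone who wants to run this end to end, here is a minimal self-contained sketch. It rebuilds the example frame from the question and also computes the same column by hand as a running sum divided by a running count. The names mean_manual and running_count are only illustrative and are not part of the original answer.

```python
import pandas as pd

# Rebuild the example frame from the question
df = pd.DataFrame({"price": [4, 6, 10, 12]})

# Expanding mean: the mean of every row seen so far
df["mean"] = df["price"].expanding().mean()

# The same result by hand: running sum divided by running row count
# (running_count and mean_manual are illustrative names)
running_count = pd.Series(range(1, len(df) + 1), index=df.index)
df["mean_manual"] = df["price"].cumsum() / running_count

print(df)
```

Both columns should agree, which is one way to see that expanding() is a cumulative mean rather than a fixed-window one.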
Answer 2:
Welcome to SO: Hopefully people will soon remember you from prior SO posts, such as this one.
From your example, it seems that @Allen has given you code that produces the answer in your table. That said, this isn't exactly the same as a "rolling" mean. The expanding() function Allen uses takes the first row and divides by n (which is 1), then adds rows 1 and 2 and divides by n (which is now 2), and so on, so that the last row is (4+6+10+12)/4 = 8.
This last number could be the answer if the window you want for the rolling mean is 4, since that would indicate that you want a mean of 4 observations. However, if you keep moving forward with a window size of 4 and start including rows 5, 6, 7... then the answer from expanding() might differ from what you want. In effect, expanding() records the mean of the entire series so far (price in this case), as though it were receiving a new piece of data at each row. "Rolling", on the other hand, gives you the result of an aggregation over a window of fixed size.
Here's another option for doing rolling calculations: the rolling() method of a pandas DataFrame.
In your case, you would do:
df['rolling_mean'] = df.price.rolling(4).mean()
df
index price rolling_mean
0 4 NaN
1 6 NaN
2 10 NaN
3 12 8.000000
Those NaN values are a result of the windowing: until there are enough rows to calculate the mean, the result is NaN. You could set a smaller window:
df['rolling_mean'] = df.price.rolling(2).mean()
df
index price rolling_mean
0 4 NaN
1 6 5.000000
2 10 8.000000
3 12 11.000000
This shows both the reduction in NaN entries and how the rolling function works: it only averages within the size-two window you provided. That results in different df['rolling_mean'] values than when using df.price.expanding().
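To see where the two methods diverge once there are more rows than the window size, here is a rough sketch on a longer series. The first four prices come from the question; the last three are made up purely for illustration, and the column names are my own.

```python
import pandas as pd

# First four prices are from the question; the last three are invented
# for illustration so the two methods have room to diverge.
df = pd.DataFrame({"price": [4, 6, 10, 12, 20, 2, 6]})

# Cumulative mean over every row seen so far
df["expanding_mean"] = df["price"].expanding().mean()

# Mean over only the most recent 4 rows
df["rolling_mean_4"] = df["price"].rolling(4).mean()

print(df)
```

At index 3 both columns are 8, because the 4-row window covers exactly the rows seen so far. From index 4 onward the rolling column drops the oldest price from its window while the expanding column keeps averaging everything, so the values part ways.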
Note: you can get rid of the NaN by using .rolling(2, min_periods=1), which tells the function the minimum number of non-NaN values that must be present within a window to calculate a result.
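A small sketch of min_periods on the frame from the question. The second column is my own addition rather than something from the answer above: it assumes you want to reproduce the expanding result with rolling(), by making the window as long as the frame and setting min_periods=1.

```python
import pandas as pd

df = pd.DataFrame({"price": [4, 6, 10, 12]})

# min_periods=1: a window holding a single value still yields a result,
# so the first row is the price itself instead of NaN
df["rolling_mean"] = df["price"].rolling(2, min_periods=1).mean()

# Illustrative extra: a window as long as the frame, with min_periods=1,
# averages every row seen so far, like expanding().mean()
df["full_window"] = df["price"].rolling(len(df), min_periods=1).mean()

print(df)
```

With min_periods=1 the first row of rolling_mean is 4 rather than NaN, and the remaining rows are unchanged from the plain rolling(2) result.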
Source: https://stackoverflow.com/questions/59606531/how-to-get-a-continuous-rolling-mean-in-pandas