Pandas DataFrame conditional .mean() depending on values in certain columns

终归单人心 2020-12-31 14:08

I'm trying to create a new column which returns the mean of values from an existing column in the same df. However, the mean should be computed based on a grouping in three

2 answers

情书的邮戳 (original poster)

2020-12-31 14:36
You can do it the way you intended by tweaking your code in the following way:
```
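# Move the three grouping columns into the index
o2 = o2.set_index(['YEAR', 'daytype', 'hourtype'])

# Group on the index levels; the per-group means are aligned back to the rows by index label
o2['premium'] = o2.groupby(level=['YEAR', 'daytype', 'hourtype'])['option_value'].mean()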
```
Why the original error? As explained by John Galt, the data coming out of groupby().mean() is not the same shape (length) as the original DataFrame: it has one row per group, not one row per original row.
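
To make the length mismatch concrete, here is a small sketch. The data values are made up for illustration; only the column names come from the question.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
o2 = pd.DataFrame({
    "YEAR": [2015, 2015, 2015, 2016],
    "daytype": ["weekday", "weekday", "weekend", "weekday"],
    "hourtype": ["peak", "peak", "peak", "peak"],
    "option_value": [1.0, 3.0, 5.0, 7.0],
})

# One mean per distinct (YEAR, daytype, hourtype) combination
group_means = o2.groupby(["YEAR", "daytype", "hourtype"])["option_value"].mean()

print(len(o2))                  # number of rows in the original frame
print(len(group_means))         # number of distinct groups, which is smaller
print(group_means.index.names)  # the grouping columns now form the index
```

The result is indexed by the grouping keys rather than by the original row labels, so it cannot be matched back to the original rows by row label.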

Pandas can handle this if you first put the 'grouping columns' in the index. Column assignment aligns on index labels, so it then knows how to propagate each group's mean to every row of that group.

John's solution follows the same logic, because groupby naturally puts the grouping columns in the index during execution.
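
As a rough end-to-end sketch of the index-alignment approach, the example below runs it on hypothetical data (the values are invented; only the column names come from the question). It also shows `transform("mean")`, a commonly used alternative that returns a result of the same length as the input. That alternative is added here for comparison and is not taken from the answer text above.

```python
import pandas as pd

keys = ["YEAR", "daytype", "hourtype"]

# Hypothetical sample data; only the column names come from the question.
o2 = pd.DataFrame({
    "YEAR": [2015, 2015, 2015, 2016],
    "daytype": ["weekday", "weekday", "weekend", "weekday"],
    "hourtype": ["peak", "peak", "peak", "peak"],
    "option_value": [1.0, 3.0, 5.0, 7.0],
})

# Approach from this answer: grouping columns go into the index first,
# then the shorter per-group result is aligned back by index label.
indexed = o2.set_index(keys)
indexed["premium"] = indexed.groupby(level=keys)["option_value"].mean()

# Alternative (an assumption, not from the answer): transform keeps the
# original length, so no index change is needed.
o2["premium"] = o2.groupby(keys)["option_value"].transform("mean")

print(indexed["premium"].tolist())  # [2.0, 2.0, 5.0, 7.0]
print(o2["premium"].tolist())       # [2.0, 2.0, 5.0, 7.0]
```

Both variants give every row the mean of its own group. The first leaves the grouping columns in the index (call `reset_index()` to get them back as columns), while the second leaves the frame's layout unchanged.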