Calculating weighted average in Pandas using NumPy function

谁都会走 submitted on 2021-01-27 23:12:00

Question


Assume we have a pandas dataframe like this:

a    b    id 
36   25   2
40   25   3
46   23   2
40   22   5
42   20   5
56   39   3

I would like to perform an operation (a div b), then group by id, and finally calculate a weighted average using "a" as the weights. It works when I only calculate the mean.

import pandas as pd
import numpy as np

df = pd.read_csv('file', sep=r'\s+')
a = (df['a'].div(df['b'])).groupby(df['id']).mean()           # works fine
b = (df['a'].div(df['b'])).groupby(df['id']).apply(lambda x: np.average(x ???, weights=x['a']))  # what goes in place of ???

I don't know how to pass the values of df['a'].div(df['b']) as the first argument of NumPy's average function. Any ideas?

Expected Output:


   id  Weighted Average
0   2          1.754146
1   3          1.504274
2   5          1.962528

Answer 1:


Are you looking to group the weighted average by id?

df.groupby('id').apply(lambda x: np.average(x['b'],weights=x['a'])).reset_index(name='Weighted Average')
Out[1]: 
   id  Weighted Average
0   2         23.878049
1   3         33.166667
2   5         20.975610

Or if you want to do the weighted average of a / b:

(df.groupby('id').apply(lambda x: np.average(x['a']/x['b'],weights=x['a']))
 .reset_index(name='Weighted Average'))
Out[2]: 
   id  Weighted Average
0   2          1.754146
1   3          1.504274
2   5          1.962528
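
For comparison, here is a minimal sketch of the same calculation without apply. It relies on the definition of a weighted mean, sum(w * x) / sum(w), with x = a / b and w = a. The DataFrame is built inline from the six rows in the question instead of being read from 'file', and the names ratio, num, den and result are only illustrative.

```python
import numpy as np
import pandas as pd

# Same six rows as in the question, built inline (assumption: no CSV file needed)
df = pd.DataFrame({
    'a': [36, 40, 46, 40, 42, 56],
    'b': [25, 25, 23, 22, 20, 39],
    'id': [2, 3, 2, 5, 5, 3],
})

# Weighted mean per group = sum(w * x) / sum(w), with x = a / b and w = a
ratio = df['a'] / df['b']
num = (ratio * df['a']).groupby(df['id']).sum()   # sum(w * x) per id
den = df['a'].groupby(df['id']).sum()             # sum(w) per id

result = (num / den).reset_index(name='Weighted Average')
print(result)
```

This should give the same three values as Out[2] above. It uses only grouped sums, so no Python lambda is called per group.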


Source: https://stackoverflow.com/questions/64236587/calculating-weighted-average-in-pandas-using-numpy-function
