Pandas: filling missing values by weighted average in each group

别说谁变了你拦得住时间么 提交于 2019-12-23 22:19:25

问题


I have a dataFrame where 'value'column has missing values. I'd like to filling missing values by weighted average within each 'name' group. There was post on how to fill the missing values by simple average in each group but not weighted average. Thanks a lot!

df = pd.DataFrame({'value': [1, np.nan, 3, 2, 3, 1, 3, np.nan, np.nan],'weight':[3,1,1,2,1,2,2,1,1], 'name': ['A','A', 'A','B','B','B', 'C','C','C']})


   name  value  weight
0    A    1.0       3
1    A    NaN       1
2    A    3.0       1
3    B    2.0       2
4    B    3.0       1
5    B    1.0       2
6    C    3.0       2
7    C    NaN       1
8    C    NaN       1

I'd like to fill in "NaN" with weighted value in each "name" group, i.e.

   name  value  weight
0    A    1.0       3
1    A    1.5       1
2    A    3.0       1
3    B    2.0       2
4    B    3.0       1
5    B    1.0       2
6    C    3.0       2
7    C    3.0       1
8    C    3.0       1

回答1:


You can group data frame by name, and use fillna method to fill the missing values with weighted average which can calculated with np.average with weights parameter:

df['value'] = (df.groupby('name', group_keys=False)
                 .apply(lambda g: g.value.fillna(np.average(g.dropna().value, weights=g.dropna().weight))))

df
#name   value   weight
#0  A    1.0    3
#1  A    1.5    1
#2  A    3.0    1
#3  B    2.0    2
#4  B    3.0    1
#5  B    1.0    2
#6  C    3.0    2
#7  C    3.0    1
#8  C    3.0    1

To make this less convoluted, define a fillValue function:

import numpy as np
import pandas as pd

def fillValue(g):
    gNotNull = g.dropna()
    wtAvg = np.average(gNotNull.value, weights=gNotNull.weight)
    return g.value.fillna(wtAvg)

df['value'] = df.groupby('name', group_keys=False).apply(fillValue)


来源:https://stackoverflow.com/questions/41782990/pandas-filling-missing-values-by-weighted-average-in-each-group

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!