Fill the NA value in one column according to values of similar columns

Submitted by 荒凉一梦 on 2020-06-23 04:27:25

Question


I want to fill the NaN in column C with the value from another row that has the same values in columns A and B, as follows:

import numpy as np
import pandas as pd

df = pd.DataFrame({'A' : ['aa', 'bb', 'cc', 'aa'], 
                   'B': ['xx', 'yy', 'zz','xx'], 
                   'C': ['2', '3','8', np.nan]})
print (df)

A  B  C
aa xx 2
bb yy 3
cc zz 8
aa xx NaN  

Expected Output:

A  B  C
aa xx 2
bb yy 3
cc zz 8
aa xx 2

Since the first row has the same values in columns A and B (aa, xx) and has 2 in column C, the last row should also get 2 in column C.


Answer 1:


Use GroupBy.ffill with DataFrame.sort_values to move the NaNs to the end of each group, then Series.sort_index to restore the original row order:

df['C'] = df.sort_values(['A','B','C']).groupby(['A','B'])['C'].ffill().sort_index()
print (df)
    A   B  C
0  aa  xx  2
1  bb  yy  3
2  cc  zz  8
3  aa  xx  2
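
To see why the sort matters, here is a minimal sketch that assumes a hypothetical reordering of the question's data in which the NaN row comes first. The names plain, sorted_df and filled are illustrative only. A plain per-group ffill has nothing earlier in the group to copy from, while sorting first moves the NaN behind the known value.

```python
import numpy as np
import pandas as pd

# Hypothetical variant of the question's data: the NaN row now comes first.
df = pd.DataFrame({'A': ['aa', 'bb', 'cc', 'aa'],
                   'B': ['xx', 'yy', 'zz', 'xx'],
                   'C': [np.nan, '3', '8', '2']})

# Without sorting, row 0 stays NaN: nothing precedes it in its group.
plain = df.groupby(['A', 'B'])['C'].ffill()

# Step 1: sort so NaN lands last within each (A, B) group
# (sort_values places NaN at the end by default).
sorted_df = df.sort_values(['A', 'B', 'C'])

# Step 2: forward-fill C within each group; original index labels are kept.
filled = sorted_df.groupby(['A', 'B'])['C'].ffill()

# Step 3: restore the original row order and assign back.
df['C'] = filled.sort_index()
print(df)
```

The sort_index call is what lets the filled values line up with the original rows again after the temporary sort.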

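Another solution, forward and back filling per group. GroupBy.transform is used here because it returns a result with the original index, so the assignment aligns with df (on recent pandas versions, apply can prepend the group keys to the index):

df['C'] = df.groupby(['A','B'])['C'].transform(lambda x: x.ffill().bfill())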
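
As a rough illustration of the forward-and-back-fill idea, the sketch below assumes a hypothetical extension of the question's data with one NaN before its group's known value and one after it. It uses GroupBy.transform, which may differ from the call shown above but returns a like-indexed result. No sorting is needed because bfill covers the NaN that ffill cannot reach.

```python
import numpy as np
import pandas as pd

# Hypothetical data: row 0 is NaN before its group's value, row 4 is NaN after.
df = pd.DataFrame({'A': ['aa', 'bb', 'cc', 'aa', 'bb'],
                   'B': ['xx', 'yy', 'zz', 'xx', 'yy'],
                   'C': [np.nan, '3', '8', '2', np.nan]})

# ffill handles NaNs that follow a known value in the group (row 4);
# bfill then handles NaNs that precede one (row 0).
df['C'] = df.groupby(['A', 'B'])['C'].transform(lambda s: s.ffill().bfill())
print(df)
```

Groups that contain only NaN would remain NaN under this approach, since there is no value to copy.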



Answer 2:


Try sort_values first to put the NaN last, then use groupby with ffill():

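df.sort_values(by=['C'],inplace=True)
# select and assign only C, since the grouping columns may be dropped from the groupby result
df['C'] = df.groupby(['A','B'])['C'].ffill()
print (df)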
    A   B   C
0   aa  xx  2
1   bb  yy  3
2   cc  zz  8
3   aa  xx  2


Source: https://stackoverflow.com/questions/57019220/fill-the-na-value-in-one-column-according-to-values-of-similar-columns
