Question
I want to fill the NaN in column C with the value from other rows that have the same values in columns A and B, as follows:
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': ['aa', 'bb', 'cc', 'aa'],
                   'B': ['xx', 'yy', 'zz', 'xx'],
                   'C': ['2', '3', '8', np.nan]})
print(df)
A B C
aa xx 2
bb yy 3
cc zz 8
aa xx NaN
Expected Output:
A B C
aa xx 2
bb yy 3
cc zz 8
aa xx 2
Rows with the same values in columns A and B should share the same value in column C. The first row (aa, xx) has 2 in column C, so the last row (aa, xx) should also have 2 there.
Answer 1:
Use GroupBy.ffill together with DataFrame.sort_values, which moves the NaNs to the end of each group, and DataFrame.sort_index, which restores the original row order:
df['C'] = df.sort_values(['A','B','C']).groupby(['A','B'])['C'].ffill().sort_index()
print (df)
A B C
0 aa xx 2
1 bb yy 3
2 cc zz 8
3 aa xx 2
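To see why the sort matters, here is a minimal sketch on hypothetical data (not the frame from the question) in which the NaN comes before the known value of its group. The names plain and sorted_fill are only for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical data: the NaN for group ('aa', 'xx') appears BEFORE its known value.
df = pd.DataFrame({'A': ['aa', 'bb', 'aa'],
                   'B': ['xx', 'yy', 'xx'],
                   'C': [np.nan, '3', '2']})

# A plain per-group forward fill has no earlier value to copy into row 0.
plain = df.groupby(['A', 'B'])['C'].ffill()

# Sorting by A, B, C first puts each group's NaN after its known values,
# so the forward fill can reach it; sort_index restores the row order.
sorted_fill = (df.sort_values(['A', 'B', 'C'])
                 .groupby(['A', 'B'])['C']
                 .ffill()
                 .sort_index())

print(plain)
print(sorted_fill)
```

In this sketch plain still has NaN in row 0, while sorted_fill has it filled with '2'.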
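Another solution is to forward fill and then back fill within each group. transform is used here instead of apply because it keeps the original index, so the result can be assigned straight back (in recent pandas versions apply may add the group keys to the index):
df['C'] = df.groupby(['A','B'])['C'].transform(lambda x: x.ffill().bfill())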
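A small self-contained sketch of the forward-then-back fill, again on hypothetical data, suggests that no sorting is needed with this variant: the NaN can sit before or after the known value in its group. It uses transform so that the result lines up with the original index; the name filled is illustrative.

```python
import numpy as np
import pandas as pd

# Hypothetical data: one group has its NaN first, the other has it last.
df = pd.DataFrame({'A': ['aa', 'bb', 'aa', 'bb'],
                   'B': ['xx', 'yy', 'xx', 'yy'],
                   'C': [np.nan, '3', '2', np.nan]})

# Within each (A, B) group: forward fill, then back fill whatever is still missing.
# transform returns a Series aligned to df's original index.
filled = df.groupby(['A', 'B'])['C'].transform(lambda s: s.ffill().bfill())

print(filled)
```

Note that if a group contains several different non-missing values, each NaN takes the nearest earlier one (or the nearest later one when there is none before it), so this assumes the values within a group agree.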
Answer 2:
Try sort_values first to put the NaN rows last, then use groupby with ffill():
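df.sort_values(by=['C'], inplace=True)
df['C'] = df.groupby(['A','B'])['C'].ffill()  # fill only C so that columns A and B are kept
df.sort_index(inplace=True)  # restore the original row order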
A B C
0 aa xx 2
1 bb yy 3
2 cc zz 8
3 aa xx 2
Source: https://stackoverflow.com/questions/57019220/fill-the-na-value-in-one-column-according-to-values-of-similar-columns