问题
I have a df like this:
A | B | C | D
14 | 5 | 10 | 5
4 | 7 | 15 | 6
100 | 220 | 6 | 7
For each row in column A,B,C, I want the find the max value and from it subtract column D and replace it.
Expected result:
A | B | C | D
9 | 5 | 10 | 5
4 | 7 | 9 | 6
100 | 213 | 6 | 7
So for the first row, it would select 14(the max out of 14,5,10), subtract column D from it (14-5 =9) and replace the result(replace initial value 14 with 9)
I know how to find the max value of A,B,C and from it subctract D, but I am stucked on the replacing part.
I tought on putting the result in another column called E, and then find again the max of A,B,C and replace with column E, but that would make no sense since I would be attempting to assign a value to a function call. Is there any other option to do this?
#Exmaple df
list_columns = ['A', 'B', 'C','D']
list_data = [ [14, 5, 10,5],[4, 7, 15,6],[100, 220, 6,7]]
df= pd.DataFrame(columns=list_columns, data=list_data)
#Calculate the max and subctract
df['e'] = df[['A', 'B']].max(axis=1) - df['D']
#To replace, maybe something like this. But this line makes no sense since it's backwards
df[['A', 'B','C']].max(axis=1) = df['D']
回答1:
Use DataFrame.mask for replace only maximal value matched by compare all values of filtered columns with maximals:
cols = ['A', 'B', 'C']
s = df[cols].max(axis=1)
df[cols] = df[cols].mask(df[cols].eq(s, axis=0), s - df['D'], axis=0)
print (df)
A B C D
0 9 5 10 5
1 4 7 9 6
2 100 213 6 7
来源:https://stackoverflow.com/questions/65846579/pandas-find-max-column-subtract-from-another-column-and-replace-the-value