Question
I am trying to impute values in my dataset conditionally.
Say I have three columns: if Column 1 is 1, then Column 2 and Column 3 should both be 0; if Column 1 is 2, then Column 2 and Column 3 should each be filled with their mean.
I tried an if statement together with any() and defined the conditions separately.
However, the imputation is not applied row by row according to the condition: I get either all mean values or all zeroes.
The exact code is below:
if (df['Retention_Term'] == 6):
    df.cl_tot_calls_term_seq_1.replace(999, np.nan, inplace=True)
    df['cl_tot_calls_term_seq_1'].fillna(df['cl_tot_calls_term_seq_1'].median(), inplace=True)
It raises the following error:
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
Answer 1:
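An if statement needs a single True/False, but df['Retention_Term'] == 6 yields one boolean per row, hence the error. Select the rows with boolean masks instead: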
mask1 = df['Retention_Term']==6
mask2 = df['cl_tot_calls_term_seq_1'] == 999
df.loc[mask1 & mask2, 'cl_tot_calls_term_seq_1'] = np.nan
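After that, fill the NaN values with the median. Assigning the result back avoids calling fillna with inplace=True on a column selection, which recent pandas versions warn about. Note that this fills every NaN in the column, not only those in rows where Retention_Term is 6.
df['cl_tot_calls_term_seq_1'] = df['cl_tot_calls_term_seq_1'].fillna(df['cl_tot_calls_term_seq_1'].median())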
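For reference, here is a minimal self-contained sketch of the mask approach on made-up data. The toy values are invented for illustration. It also assumes that only the 999 placeholders in rows where Retention_Term is 6 should be imputed, and that the median should come from the genuine values in those same rows. That is a guess at the intent, and it differs from the column-wide median used above.

```python
import pandas as pd

# Toy data, made up for illustration; 999 stands in for "missing".
df = pd.DataFrame({
    "Retention_Term": [6, 6, 6, 12, 12],
    "cl_tot_calls_term_seq_1": [10.0, 999.0, 30.0, 999.0, 50.0],
})
col = "cl_tot_calls_term_seq_1"

# Row-wise boolean masks take the place of the scalar `if` test.
is_term_6 = df["Retention_Term"] == 6
is_placeholder = df[col] == 999

# Assumption: the median is taken over genuine values in the
# Retention_Term == 6 rows only, not over the whole column.
median_term_6 = df.loc[is_term_6 & ~is_placeholder, col].median()

# Overwrite only the placeholder rows that satisfy the condition.
df.loc[is_term_6 & is_placeholder, col] = median_term_6

print(df)
```

Rows with a different Retention_Term keep their 999 placeholder, so a separate rule (for example a constant 0) can be applied to them with another mask in the same way.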
Source: https://stackoverflow.com/questions/61713051/conditional-data-imputation-in-python