Question
I have a dataframe like the one below.
I am trying to remove from the Main column the string that appears in the substring column.
Main                        substring
Sri playnig well cricket    cricket
sri went out                NaN
Ram is in                   NaN
Ram went to UK,US           UK,US
My expected output is:
Main                substring
Sri playnig well    cricket
sri went out        NaN
Ram is in           NaN
Ram went to         UK,US
I tried df["Main"].str.reduce(df["substring"]),
but it does not work. Please help.
Answer 1:
This is one way, using pd.DataFrame.apply. Note that np.nan == np.nan evaluates to False; we can use this trick in our function to decide when to apply the removal logic.
import pandas as pd, numpy as np

df = pd.DataFrame({'Main': ['Sri playnig well cricket', 'sri went out',
                            'Ram is in', 'Ram went to UK,US'],
                   'substring': ['cricket', np.nan, np.nan, 'UK,US']})

def remover(row):
    sub = row['substring']
    # NaN is the only value that is not equal to itself
    if sub != sub:
        return row['Main']
    else:
        # drop every whitespace-separated token equal to the substring
        lst = row['Main'].split()
        return ' '.join([i for i in lst if i != sub])

df['Main'] = df.apply(remover, axis=1)

print(df)
               Main substring
0  Sri playnig well   cricket
1      sri went out       NaN
2         Ram is in       NaN
3       Ram went to     UK,US
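The `sub != sub` test in `remover` relies on NaN being unequal to itself. A minimal sketch of that behaviour is shown here, next to `pd.isna`, which is a more explicit way to express the same check; the variable names are illustrative only and are not part of the answer above.

```python
import numpy as np
import pandas as pd

sub = np.nan

# NaN compares unequal to everything, including itself
missing_by_comparison = sub != sub

# pd.isna states the same intent more explicitly
missing_by_isna = pd.isna(sub)

# an ordinary string is equal to itself, so neither check flags it
text = 'cricket'
text_flagged = (text != text) or pd.isna(text)
```

Either check could be used inside `remover`; `pd.isna(sub)` may be easier to read for someone unfamiliar with the self-comparison trick.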
Answer 2:
This one-liner should do it:
df.loc[df['substring'].notnull(), 'Main'] = df.loc[df['substring'].notnull()].apply(lambda x: x['Main'].replace(x['substring'], ''), axis=1)
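Because `str.replace` only removes the matched text, the one-liner leaves the space that preceded the substring (for example `'Sri playnig well '`). The following is a hedged variant, not the answerer's code, that computes the mask once and strips the leftover whitespace; `mask` and `result` are names introduced here for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'Main': ['Sri playnig well cricket', 'sri went out',
                            'Ram is in', 'Ram went to UK,US'],
                   'substring': ['cricket', np.nan, np.nan, 'UK,US']})

# rows that actually have a substring to remove
mask = df['substring'].notna()

# remove the substring row by row, then trim the whitespace left behind
df.loc[mask, 'Main'] = [
    main.replace(sub, '').strip()
    for main, sub in zip(df.loc[mask, 'Main'], df.loc[mask, 'substring'])
]

result = df['Main'].tolist()
```

Note that, unlike the token-based approach in Answer 1, `replace` also removes the substring when it occurs inside a longer word, so which variant fits depends on the data.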
Source: https://stackoverflow.com/questions/50230233/how-to-reduce-part-of-a-dataframe-colunm-value-based-on-another-column