I have a simple dataframe as such:
df = [ {\'col1\' : \'A\', \'col2\': \'B\', \'col3\': \'C\', \'col4\':\'0\'},
{\'col1\' : \'M\', \'col2\':
You can use the duplicated
method to return a boolean indexer of whether elements are duplicates or not:
In [214]: pd.Series(['M', '0', 'M', '0']).duplicated()
Out[214]:
0 False
1 False
2 True
3 True
dtype: bool
Then you could create a mask by mapping this across the rows of your dataframe, and using where
to perform your substitution:
is_duplicate = df.apply(pd.Series.duplicated, axis=1)
df.where(~is_duplicate, 0)
col1 col2 col3 col4
0 A B C 0
1 M 0 0 0
2 B 0 0 0
3 X 0 Y 0