Replace duplicate values across columns in Pandas

灰色年华 2020-12-06 11:10

I have a simple dataframe as such:

df = [{'col1': 'A', 'col2': 'B', 'col3': 'C', 'col4': '0'},
      {'col1': 'M', 'col2':


        
1 answer
  • 2020-12-06 11:43

    You can use the Series.duplicated method to get a boolean indexer that is True for every element repeating an earlier one (the first occurrence stays False):

    In [214]: pd.Series(['M', '0', 'M', '0']).duplicated()
    Out[214]:
    0    False
    1    False
    2     True
    3     True
    dtype: bool
    

    Then you can build a mask by applying this to each row of your dataframe (axis=1), and use where to keep the non-duplicates and substitute 0 for the rest:

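    # True wherever a value repeats an earlier value in the same row
    is_duplicate = df.apply(pd.Series.duplicated, axis=1)
    # keep values where the mask is False, replace the rest with 0
    df.where(~is_duplicate, 0)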
    
      col1 col2 col3 col4
    0    A    B    C    0
    1    M    0    0    0
    2    B    0    0    0
    3    X    0    Y    0
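
For anyone who wants to run this end to end, here is a minimal sketch. The rows in the question are cut off, so the sample data below is an assumption, chosen only so that the result lines up with the output above; it is not the asker's actual data. The sketch also replaces duplicates with the string '0' rather than the integer 0, so every cell keeps the same type, and it adds a defensive astype(bool) before inverting the mask.

```python
import pandas as pd

# Hypothetical sample data: the question's rows are truncated, so these
# values are assumptions picked to be consistent with the output shown above.
rows = [
    {'col1': 'A', 'col2': 'B', 'col3': 'C', 'col4': '0'},
    {'col1': 'M', 'col2': '0', 'col3': 'M', 'col4': '0'},
    {'col1': 'B', 'col2': 'B', 'col3': 'B', 'col4': '0'},
    {'col1': 'X', 'col2': 'X', 'col3': 'Y', 'col4': '0'},
]
df = pd.DataFrame(rows)

# Row-wise mask: True where a value already appeared earlier in the same row.
# astype(bool) is a defensive cast so that ~ is a plain logical NOT.
is_duplicate = df.apply(pd.Series.duplicated, axis=1).astype(bool)

# Keep cells where the mask is False; replace duplicates with the string '0'.
result = df.where(~is_duplicate, '0')
print(result)
```

Note that where returns a new dataframe and leaves df unchanged, so assign the result if you want to keep it.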
    