Drop rows based on a condition over two columns

Posted by 浪尽此生 on 2019-12-20 06:18:57

Question


I have a DataFrame that looks like this:

df
Data1   Data2   Data3
A       XX      AA
A       YY      AA
B       XX      BB
B       YY      CC
C       XX      DD
C       YY      DD
D       XX      EE
D       YY      FF

I want to delete rows by comparing Data3 within each Data1 group (the rows that share a Data1 value and differ in Data2): if the Data3 values in a group are all the same, every row of that group should be deleted.

My expected result looks like this:

Data1   Data2   Data3
B       XX      BB
B       YY      CC
D       XX      EE
D       YY      FF

How can I do this?


Answer 1:


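Use groupby + transform with nunique to count the distinct Data3 values in each Data1 group, then keep only the rows whose group has more than one:

yd = df[df.groupby('Data1').Data3.transform('nunique').gt(1)].copy()
yd
Out[506]: 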
  Data1 Data2 Data3
2     B    XX    BB
3     B    YY    CC
6     D    XX    EE
7     D    YY    FF
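
As a self-contained sketch of the same idea, the snippet below rebuilds the sample frame from the question and splits the one-liner into steps. The names distinct_per_row, mask and result are illustrative only and do not come from the answer.

```python
import pandas as pd

# Rebuild the sample frame from the question.
df = pd.DataFrame({
    "Data1": ["A", "A", "B", "B", "C", "C", "D", "D"],
    "Data2": ["XX", "YY"] * 4,
    "Data3": ["AA", "AA", "BB", "CC", "DD", "DD", "EE", "FF"],
})

# transform('nunique') returns one value per original row: the number of
# distinct Data3 values found in that row's Data1 group.
distinct_per_row = df.groupby("Data1")["Data3"].transform("nunique")

# Keep only rows whose group holds more than one distinct Data3 value.
mask = distinct_per_row.gt(1)
result = df[mask].copy()
print(result)
```

Because transform keeps the original index, the boolean mask lines up with df row by row, which is why it can be used directly for selection.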



Answer 2:


You can also use groupby with nunique to get one count per Data1 value, then select the rows whose Data1 value has a count greater than 1:

>>> group = df.groupby('Data1')['Data3'].nunique()
>>> df[df['Data1'].isin(group[group.gt(1)].index)]
  Data1 Data2 Data3
2     B    XX    BB
3     B    YY    CC
6     D    XX    EE
7     D    YY    FF
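
The same selection can be written as a standalone script. This is a minimal sketch that rebuilds the question's sample data; the variable names (counts, keep_keys, result, filtered) are illustrative. The last step uses groupby(...).filter, which is an alternative not mentioned in either answer, shown only for comparison.

```python
import pandas as pd

# Rebuild the sample frame from the question.
df = pd.DataFrame({
    "Data1": ["A", "A", "B", "B", "C", "C", "D", "D"],
    "Data2": ["XX", "YY"] * 4,
    "Data3": ["AA", "AA", "BB", "CC", "DD", "DD", "EE", "FF"],
})

# One count per Data1 value: how many distinct Data3 values it has.
counts = df.groupby("Data1")["Data3"].nunique()

# Data1 values whose group has more than one distinct Data3 value.
keep_keys = counts[counts.gt(1)].index

# Select the rows that belong to those groups.
result = df[df["Data1"].isin(keep_keys)]
print(result)

# Alternative (not from the answers): let groupby.filter drop whole groups.
filtered = df.groupby("Data1").filter(lambda g: g["Data3"].nunique() > 1)
print(filtered)
```

The isin version computes one count per group and reuses it, while filter calls a Python function once per group, so on frames with many groups the first form is likely to be faster.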


Source: https://stackoverflow.com/questions/56519823/drop-row-based-on-two-columns-conditions
