I have a DataFrame that looks like this:
df
Data1 Data2 Data3
A XX AA
A YY AA
B XX BB
B YY CC
C XX DD
C YY DD
D XX EE
D YY FF
I want to delete rows based on the columns Data1 and Data2: for each value of Data1, if Data3 is the same on both of its rows (the XX row and the YY row), delete those rows.
My expected result looks like this:
Data1 Data2 Data3
B XX BB
B YY CC
D XX EE
D YY FF
How can I do this?
Use groupby + transform with nunique:
yd = df[df.groupby('Data1')['Data3'].transform('nunique').gt(1)].copy()
yd
Out[506]:
Data1 Data2 Data3
2 B XX BB
3 B YY CC
6 D XX EE
7 D YY FF
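For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the sample frame from the question and splits the one-liner into steps. The names distinct_per_row, mask and result are only illustrative; they are not part of the answer above.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Data1": ["A", "A", "B", "B", "C", "C", "D", "D"],
    "Data2": ["XX", "YY", "XX", "YY", "XX", "YY", "XX", "YY"],
    "Data3": ["AA", "AA", "BB", "CC", "DD", "DD", "EE", "FF"],
})

# Count the distinct Data3 values in each Data1 group and broadcast
# that count back onto every row of the group (same index as df).
distinct_per_row = df.groupby("Data1")["Data3"].transform("nunique")

# A group whose Data3 values differ has more than one distinct value.
mask = distinct_per_row.gt(1)

# Keep only the rows of those groups; .copy() avoids chained-assignment
# warnings if the result is modified later.
result = df[mask].copy()
print(result)
```

Because transform returns a Series aligned with df, it can be used directly as a boolean mask, which is why no merge or isin step is needed here.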
You can also use a groupby with nunique, and a selection of rows:
>>> group = df.groupby('Data1')['Data3'].nunique()
>>> df[df['Data1'].isin(group[group.gt(1)].index)]
Data1 Data2 Data3
2 B XX BB
3 B YY CC
6 D XX EE
7 D YY FF
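To see why the isin selection works, it can help to look at the intermediate values. The sketch below is self-contained and uses the sample data from the question; the names counts, keep and result are assumptions made for illustration.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Data1": ["A", "A", "B", "B", "C", "C", "D", "D"],
    "Data2": ["XX", "YY", "XX", "YY", "XX", "YY", "XX", "YY"],
    "Data3": ["AA", "AA", "BB", "CC", "DD", "DD", "EE", "FF"],
})

# One row per Data1 value: the number of distinct Data3 values in that group.
counts = df.groupby("Data1")["Data3"].nunique()

# Index labels (Data1 values) of the groups with more than one distinct Data3.
keep = counts[counts.gt(1)].index

# Select the original rows whose Data1 is one of those labels.
result = df[df["Data1"].isin(keep)]
print(result)
```

Unlike the transform version, nunique here returns one value per group rather than one per row, so the group labels have to be mapped back onto the rows with isin. Note that nunique ignores NaN by default, so a group containing one real value plus missing values would count as having a single distinct value.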
Source: https://stackoverflow.com/questions/56519823/drop-row-based-on-two-columns-conditions