Question
I have a DataFrame df that looks like this:
Data1 Data2 Data3
A XX AA
A YY AA
B XX BB
B YY CC
C XX DD
C YY DD
D XX EE
D YY FF
I want to delete rows by comparing Data3 within each Data1 group (one row per Data2 value): if all the Data3 values in a group are the same, delete that group's rows.
My expected result looks like this:
Data1 Data2 Data3
B XX BB
B YY CC
D XX EE
D YY FF
How can I do this?
Answer 1:
Use groupby + transform with nunique:
yd = df[df.groupby('Data1')['Data3'].transform('nunique').gt(1)].copy()
yd
Out[506]:
Data1 Data2 Data3
2 B XX BB
3 B YY CC
6 D XX EE
7 D YY FF
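To make the one-liner above easier to follow, here is a minimal self-contained sketch that splits it into steps. The sample frame is rebuilt from the table in the question, and the names distinct_per_row and kept are only illustrative; they do not appear in the original answer.

```python
import pandas as pd

# Sample data reconstructed from the table in the question.
df = pd.DataFrame({
    "Data1": ["A", "A", "B", "B", "C", "C", "D", "D"],
    "Data2": ["XX", "YY"] * 4,
    "Data3": ["AA", "AA", "BB", "CC", "DD", "DD", "EE", "FF"],
})

# Count the distinct Data3 values in each Data1 group and broadcast
# that count back onto every row of the group (same length as df).
distinct_per_row = df.groupby("Data1")["Data3"].transform("nunique")

# Keep only the rows whose group holds more than one distinct Data3 value.
kept = df[distinct_per_row.gt(1)].copy()
print(kept)
```

Because transform returns a Series aligned with the original index, it can be used directly as a boolean mask, and the original row labels (2, 3, 6, 7) are preserved.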
Answer 2:
You can also use groupby with nunique, followed by a selection of rows:
>>> group = df.groupby('Data1')['Data3'].nunique()
>>> df[df['Data1'].isin(group[group.gt(1)].index)]
Data1 Data2 Data3
2 B XX BB
3 B YY CC
6 D XX EE
7 D YY FF
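The same approach can be run as a standalone script. This is a sketch under the assumption that df matches the table in the question; keep_keys and result are illustrative names that are not part of the original answer.

```python
import pandas as pd

# Sample data reconstructed from the table in the question.
df = pd.DataFrame({
    "Data1": ["A", "A", "B", "B", "C", "C", "D", "D"],
    "Data2": ["XX", "YY"] * 4,
    "Data3": ["AA", "AA", "BB", "CC", "DD", "DD", "EE", "FF"],
})

# One entry per Data1 key: the number of distinct Data3 values in that group.
group = df.groupby("Data1")["Data3"].nunique()

# Keys whose group contains more than one distinct Data3 value.
keep_keys = group[group.gt(1)].index

# Select the rows of df that belong to one of those keys.
result = df[df["Data1"].isin(keep_keys)]
print(result)
```

Unlike transform, nunique here returns one value per group rather than one per row, so the intermediate group Series is handy when you also want to inspect the per-group counts.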
Source: https://stackoverflow.com/questions/56519823/drop-row-based-on-two-columns-conditions