Question
I have a dataframe that looks like:
   a A  a B  a C  a D  a E  a F  p A  p B  p C  p D  p E  p F
0    0    0    0    0    0    0    0    0    0    0    0    0
1    1    0    0    0    0    0    0    0    0    0    0    0
2    0    1    0    0    0    0    0    0    1    0    0    0
3    0    0    1    0    0    1    0    0    0    0    0    0
4    0    0    0    1    0    1    0    0    0    0    0    0
5    0    0    0    0    1    0    0    0    0    0    0    0
6    0    0    0    0    0    0    1    0    0    0    0    0
import pandas as pd
df = pd.DataFrame({'a A':[0,1,0,0,0,0,0],'a B':[0,0,1,0,0,0,0],'a C':[0,0,0,1,0,0,0],'a D':[0,0,0,0,1,0,0],'a E':[0,0,0,0,0,1,0],'a F':[0,0,0,1,1,0,0],'p A':[0,0,0,0,0,0,1],'p B':[0,0,0,0,0,0,0],'p C':[0,0,1,0,0,0,0],'p D':[0,0,0,0,0,0,0],'p E':[0,0,0,0,0,0,0],'p F':[0,0,0,0,0,0,0]})
Note: This is a much simplified version of my actual data.
a stands for Actual; p stands for Predicted; A - F represent a series of labels
I want to write a query that, for each row in my dataframe, returns True when: (all row values in "p columns" = 0 ) and (at least one row value in "a columns" = 1) i.e. for each row, p columns are fixed at 0 and at least 1 a column = 1.
Using the answers to "Pandas Dataframe Find Rows Where all Columns Equal" and "Compare two columns using pandas", I currently achieve this with & and np.any():
((df.iloc[:,6] == 0) & (df.iloc[:,7] == 0) & (df.iloc[:,8] == 0) & (df.iloc[:,9] == 0) & (df.iloc[:,10] == 0) & (df.iloc[:,11] == 0) & df.iloc[:,0:6].any(axis = 1) )
>>
0 False
1 True
2 False
3 True
4 True
5 True
6 False
dtype: bool
Is there a more succinct, readable way I can achieve this?
Answer 1:
You can use ~ to invert a boolean mask, together with iloc to select columns by position:
print(~df.iloc[:, 6:12].any(axis=1) & df.iloc[:, 0:6].any(axis=1))
0 False
1 True
2 False
3 True
4 True
5 True
6 False
dtype: bool
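The positional version only gives this result if the a columns really sit in positions 0-5 and the p columns in positions 6-11. As a minimal sketch, assuming a pandas version that keeps the dict's insertion order, the columns can be sorted by name first so the layout matches the table in the question. The names actual, predicted and mask are illustrative only, not part of the original answer:

```python
import pandas as pd

# Same data as in the question; the dict lists the 'p' columns first.
df = pd.DataFrame({
    'p A': [0, 0, 0, 0, 0, 0, 1], 'p B': [0, 0, 0, 0, 0, 0, 0],
    'p C': [0, 0, 1, 0, 0, 0, 0], 'p D': [0, 0, 0, 0, 0, 0, 0],
    'p E': [0, 0, 0, 0, 0, 0, 0], 'p F': [0, 0, 0, 0, 0, 0, 0],
    'a A': [0, 1, 0, 0, 0, 0, 0], 'a B': [0, 0, 1, 0, 0, 0, 0],
    'a C': [0, 0, 0, 1, 0, 0, 0], 'a D': [0, 0, 0, 0, 1, 0, 0],
    'a E': [0, 0, 0, 0, 0, 1, 0], 'a F': [0, 0, 0, 1, 1, 0, 0],
})

# Sort the columns by name so 'a A'..'a F' come first, then 'p A'..'p F'.
df = df[sorted(df.columns)]

actual = df.iloc[:, 0:6]      # the six 'a' columns
predicted = df.iloc[:, 6:12]  # the six 'p' columns (end of slice is exclusive)

# True where no 'p' value is set and at least one 'a' value is set.
mask = ~predicted.any(axis=1) & actual.any(axis=1)
print(mask)
```

Without the sort, the two slices would pick up the wrong column groups whenever the frame is built with the p columns first.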
Or use filter to select columns by name, with any to check for at least one True per row, or all to check that every value in the row is True.
The eq function compares the values with 0.
print(~df.filter(like='p').any(axis=1) & df.filter(like='a').any(axis=1))
0 False
1 True
2 False
3 True
4 True
5 True
6 False
dtype: bool
print(df.filter(like='p').eq(0).all(axis=1) & df.filter(like='a').any(axis=1))
0 False
1 True
2 False
3 True
4 True
5 True
6 False
dtype: bool
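One caveat: filter(like='p') keeps every column whose name contains the substring p, so it can pick up unrelated columns on real data. A stricter variant is to select by prefix. This is only a sketch, assuming every column name starts with 'a ' or 'p ' as in the example; p_cols, a_cols, mask and alt are made-up names:

```python
import pandas as pd

df = pd.DataFrame({
    'p A': [0, 0, 0, 0, 0, 0, 1], 'p B': [0, 0, 0, 0, 0, 0, 0],
    'p C': [0, 0, 1, 0, 0, 0, 0], 'p D': [0, 0, 0, 0, 0, 0, 0],
    'p E': [0, 0, 0, 0, 0, 0, 0], 'p F': [0, 0, 0, 0, 0, 0, 0],
    'a A': [0, 1, 0, 0, 0, 0, 0], 'a B': [0, 0, 1, 0, 0, 0, 0],
    'a C': [0, 0, 0, 1, 0, 0, 0], 'a D': [0, 0, 0, 0, 1, 0, 0],
    'a E': [0, 0, 0, 0, 0, 1, 0], 'a F': [0, 0, 0, 1, 1, 0, 0],
})

# Pick the column groups by explicit prefix rather than by substring.
p_cols = df.columns[df.columns.str.startswith('p ')]
a_cols = df.columns[df.columns.str.startswith('a ')]

# All predicted values are 0 and at least one actual value is 1.
mask = df[p_cols].eq(0).all(axis=1) & df[a_cols].eq(1).any(axis=1)

# For 0/1 data, "no p value is set" is the same condition written with ~any.
alt = ~df[p_cols].any(axis=1) & df[a_cols].any(axis=1)

print(mask)
print(mask.equals(alt))
```

Selecting by name also makes the result independent of the column order in the frame.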
Source: https://stackoverflow.com/questions/42647710/compare-boolean-row-values-across-multiple-columns-in-pandas-using-np-where