I have a dataframe that looks like:
   a A  a B  a C  a D  a E  a F  p A  p B  p C  p D  p E  p F
0    0    0    0    0    0    0    0    0    0    0    0    0
1    1    0    0    0    0    0    0    0    0    0    0    0
2    0    1    0    0    0    0    0    0    1    0    0    0
3    0    0    1    0    0    1    0    0    0    0    0    0
4    0    0    0    1    0    1    0    0    0    0    0    0
5    0    0    0    0    1    0    0    0    0    0    0    0
6    0    0    0    0    0    0    1    0    0    0    0    0
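import pandas as pd

# The "a" columns are listed first so that positions 0-5 are Actual and 6-11 are Predicted,
# matching the table above (recent pandas versions keep dict insertion order).
df = pd.DataFrame({'a A':[0,1,0,0,0,0,0],'a B':[0,0,1,0,0,0,0],'a C':[0,0,0,1,0,0,0],'a D':[0,0,0,0,1,0,0],'a E':[0,0,0,0,0,1,0],'a F':[0,0,0,1,1,0,0],
                   'p A':[0,0,0,0,0,0,1],'p B':[0,0,0,0,0,0,0],'p C':[0,0,1,0,0,0,0],'p D':[0,0,0,0,0,0,0],'p E':[0,0,0,0,0,0,0],'p F':[0,0,0,0,0,0,0]})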
Note: This is a much simplified version of my actual data.
a stands for Actual; p stands for Predicted; A - F represent a series of labels
I want to write a query that, for each row in my dataframe, returns True when: (all row values in "p columns" = 0 ) and (at least one row value in "a columns" = 1) i.e. for each row, p columns are fixed at 0 and at least 1 a column = 1.
Using the answers to "Pandas Dataframe Find Rows Where all Columns Equal" and "Compare two columns using pandas", I currently achieve this by using & and .any():
((df.iloc[:,6] == 0) & (df.iloc[:,7] == 0) & (df.iloc[:,8] == 0) & (df.iloc[:,9] == 0) & (df.iloc[:,10] == 0) & (df.iloc[:,11] == 0) & df.iloc[:,0:6].any(axis = 1) )
>>
0 False
1 True
2 False
3 True
4 True
5 True
6 False
dtype: bool
Is there a more succinct, readable way I can achieve this?
You can use ~ to invert a boolean mask, together with iloc to select columns by position:
print (~df.iloc[:,6:12].any(axis=1) & df.iloc[:,0:6].any(axis=1))
0 False
1 True
2 False
3 True
4 True
5 True
6 False
dtype: bool
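Position-based slices are easy to get off by one, because the end of an iloc slice is excluded. As a minimal sketch of an alternative, assuming the columns are contiguous and ordered as in the table in the question, label slices with loc include both endpoints:

```python
import pandas as pd

# Same data as in the question, with the "a" (Actual) columns first.
df = pd.DataFrame({
    'a A': [0, 1, 0, 0, 0, 0, 0], 'a B': [0, 0, 1, 0, 0, 0, 0],
    'a C': [0, 0, 0, 1, 0, 0, 0], 'a D': [0, 0, 0, 0, 1, 0, 0],
    'a E': [0, 0, 0, 0, 0, 1, 0], 'a F': [0, 0, 0, 1, 1, 0, 0],
    'p A': [0, 0, 0, 0, 0, 0, 1], 'p B': [0, 0, 0, 0, 0, 0, 0],
    'p C': [0, 0, 1, 0, 0, 0, 0], 'p D': [0, 0, 0, 0, 0, 0, 0],
    'p E': [0, 0, 0, 0, 0, 0, 0], 'p F': [0, 0, 0, 0, 0, 0, 0],
})

# Label slices with .loc include both endpoints, unlike .iloc slices.
actual = df.loc[:, 'a A':'a F']
predicted = df.loc[:, 'p A':'p F']

# True where no predicted column is set and at least one actual column is set.
mask = ~predicted.any(axis=1) & actual.any(axis=1)
print(mask.tolist())  # [False, True, False, True, True, True, False]
```

This still depends on the column order, so it only helps when the two groups are stored next to each other.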
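Or use filter to select columns by name, any to check for at least one True per row, or all to check that every value in a row is True. The eq function compares each value with 0. Note that like='p' matches any column name containing a lowercase p, so this relies on the labels A - F not containing a lowercase a or p themselves.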
print (~df.filter(like='p').any(axis=1) & df.filter(like='a').any(axis=1))
0 False
1 True
2 False
3 True
4 True
5 True
6 False
dtype: bool
print (df.filter(like='p').eq(0).all(axis=1) & df.filter(like='a').any(axis=1))
0 False
1 True
2 False
3 True
4 True
5 True
6 False
dtype: bool
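If the real labels might contain the letters a or p, matching on the prefix is safer than a substring match. The following is only a sketch: the helper name no_prediction_but_actual and its prefix parameters are hypothetical, and it assumes every column name starts with 'a ' or 'p ' as in the sample data. It does not depend on the column order.

```python
import pandas as pd

def no_prediction_but_actual(df, actual_prefix='a ', predicted_prefix='p '):
    """Hypothetical helper: True for rows where every predicted column is 0
    and at least one actual column is nonzero."""
    # Select each group by the start of the column name, not by substring.
    actual = df.loc[:, df.columns.str.startswith(actual_prefix)]
    predicted = df.loc[:, df.columns.str.startswith(predicted_prefix)]
    return predicted.eq(0).all(axis=1) & actual.any(axis=1)

# Sample data from the question; here the "p" columns come first on purpose.
df = pd.DataFrame({
    'p A': [0, 0, 0, 0, 0, 0, 1], 'p B': [0, 0, 0, 0, 0, 0, 0],
    'p C': [0, 0, 1, 0, 0, 0, 0], 'p D': [0, 0, 0, 0, 0, 0, 0],
    'p E': [0, 0, 0, 0, 0, 0, 0], 'p F': [0, 0, 0, 0, 0, 0, 0],
    'a A': [0, 1, 0, 0, 0, 0, 0], 'a B': [0, 0, 1, 0, 0, 0, 0],
    'a C': [0, 0, 0, 1, 0, 0, 0], 'a D': [0, 0, 0, 0, 1, 0, 0],
    'a E': [0, 0, 0, 0, 0, 1, 0], 'a F': [0, 0, 0, 1, 1, 0, 0],
})

mask = no_prediction_but_actual(df)
matching_rows = df.index[mask].tolist()
print(matching_rows)  # [1, 3, 4, 5]
```

The mask can also be used directly as df[mask] to pull out the matching rows.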
Source: https://stackoverflow.com/questions/42647710/compare-boolean-row-values-across-multiple-columns-in-pandas-using-np-where