I am trying to filter a df using several Boolean variables that are a part of the df, but have been unable to do so.
Sample data:
In [82]: d
Out[82]:
             A   B      C      D
0     John Doe  45   True  False
1   Jane Smith  32  False  False
2  Alan Holmes  55  False   True
3   Eric Lamar  29   True   True
Solution 1:
In [83]: d.loc[d.C | d.D]
Out[83]:
             A   B      C      D
0     John Doe  45   True  False
2  Alan Holmes  55  False   True
3   Eric Lamar  29   True   True
Solution 2:
In [94]: d[d[['C','D']].any(axis=1)]
Out[94]:
             A   B      C      D
0     John Doe  45   True  False
2  Alan Holmes  55  False   True
3   Eric Lamar  29   True   True
Solution 3:
In [95]: d.query("C or D")
Out[95]:
             A   B      C      D
0     John Doe  45   True  False
2  Alan Holmes  55  False   True
3   Eric Lamar  29   True   True
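If you want to try the three solutions outside an interactive session, here is a minimal sketch. It assumes the frame looks like the one printed above (the values are copied from that output), and the names by_operator, by_any and by_query are made up for the illustration.

```python
import pandas as pd

# Rebuild the sample frame (values copied from the session output above).
d = pd.DataFrame({
    "A": ["John Doe", "Jane Smith", "Alan Holmes", "Eric Lamar"],
    "B": [45, 32, 55, 29],
    "C": [True, False, False, True],
    "D": [False, False, True, True],
})

by_operator = d.loc[d.C | d.D]           # element-wise OR of two boolean Series
by_any = d[d[["C", "D"]].any(axis=1)]    # True where any listed column is True in that row
by_query = d.query("C or D")             # the same condition written as a query string

# Each variant keeps the rows where C or D is True.
print(by_operator.index.tolist())
print(by_operator.equals(by_any) and by_operator.equals(by_query))
```

The any(axis=1) form is probably the easiest one to extend when the list of boolean columns grows, since only the column list changes.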
PS: if you change your solution to:
df[(df['C']==True) | (df['D']==True)]
it will work too.
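The parentheses in that expression matter. The original attempt is not shown here, so the following is only a guess at one common way this goes wrong: in Python, | binds tighter than ==, so dropping the parentheses changes how the expression is parsed. A small sketch on a cut-down frame (the variable names mask and unparenthesized_error are hypothetical):

```python
import pandas as pd

df = pd.DataFrame({"C": [True, False, False, True],
                   "D": [False, False, True, True]})

# With parentheses, each comparison is evaluated first, then OR-ed row by row.
mask = (df["C"] == True) | (df["D"] == True)

# For columns that are already boolean, the comparison to True is redundant.
assert mask.equals(df["C"] | df["D"])

# Without parentheses, | binds tighter than ==, so Python reads the expression
# as the chained comparison  df["C"] == (True | df["D"]) == True  and must
# collapse a whole Series into a single bool, which pandas refuses to do.
try:
    df[df["C"] == True | df["D"] == True]
    unparenthesized_error = None
except ValueError as exc:
    unparenthesized_error = exc

print(type(unparenthesized_error).__name__)
```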
Pandas docs - boolean indexing
Why should we NOT use the "PEP compliant" df["col_name"] is True instead of df["col_name"] == True?
In [11]: df = pd.DataFrame({"col":[True, True, True]})
In [12]: df
Out[12]:
    col
0  True
1  True
2  True
In [13]: df["col"] is True
Out[13]: False # <----- oops, that's not exactly what we wanted
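To spell out what happens there: is compares object identity, so it asks whether the Series object itself is the singleton True, and never looks at the values. A short sketch contrasting the two, using a made-up frame with mixed values so the difference is visible (identity_check, elementwise and selected are illustrative names):

```python
import pandas as pd

df = pd.DataFrame({"col": [True, False, True]})

# `is` compares object identity: a Series is never the singleton True,
# so this yields one plain bool and cannot serve as a row mask.
identity_check = df["col"] is True

# `==` is applied element by element and returns a boolean Series.
elementwise = df["col"] == True

# A boolean column is already a valid mask, so indexing with it directly
# avoids both the `is` pitfall and style warnings about `== True`.
selected = df[df["col"]]

print(identity_check)
print(elementwise.tolist())
print(selected.index.tolist())
```

If the column is already of bool dtype, indexing with the column itself is likely the cleanest option.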