Filtering pandas dataframe with multiple Boolean columns

闹比i 2020-12-05 07:26

I am trying to filter a df using several Boolean variables that are a part of the df, but have been unable to do so.

Sample data:

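A | B | C | D
John Doe | 45 | True | False
Jane Smith | 32 | False | False
Alan Holmes | 55 | False | True
Eric Lamar | 29 | True | True
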
5 answers
  •  悲&欢浪女
    2020-12-05 08:16

    In [82]: d
    Out[82]:
                 A   B      C      D
    0     John Doe  45   True  False
    1   Jane Smith  32  False  False
    2  Alan Holmes  55  False   True
    3   Eric Lamar  29   True   True
    

    Solution 1:

    In [83]: d.loc[d.C | d.D]
    Out[83]:
                 A   B      C      D
    0     John Doe  45   True  False
    2  Alan Holmes  55  False   True
    3   Eric Lamar  29   True   True
    

    Solution 2:

    In [94]: d[d[['C','D']].any(axis=1)]
    Out[94]:
                 A   B      C      D
    0     John Doe  45   True  False
    2  Alan Holmes  55  False   True
    3   Eric Lamar  29   True   True
    

    Solution 3:

    In [95]: d.query("C or D")
    Out[95]:
                 A   B      C      D
    0     John Doe  45   True  False
    2  Alan Holmes  55  False   True
    3   Eric Lamar  29   True   True
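
The three solutions above select the same rows. A minimal sketch to check that, assuming the frame is rebuilt by hand from the printed output (the variable names `by_or`, `by_any` and `by_query` are only illustrative):

```python
import pandas as pd

# Sample frame rebuilt from the printed output above (an assumption:
# the original construction code is not shown in the thread).
d = pd.DataFrame({
    "A": ["John Doe", "Jane Smith", "Alan Holmes", "Eric Lamar"],
    "B": [45, 32, 55, 29],
    "C": [True, False, False, True],
    "D": [False, False, True, True],
})

by_or = d.loc[d.C | d.D]                # element-wise OR of two boolean Series
by_any = d[d[["C", "D"]].any(axis=1)]   # True where any listed column is True
by_query = d.query("C or D")            # query() evaluates `or` element-wise

# True if all three approaches return identical frames
print(by_or.equals(by_any) and by_or.equals(by_query))
```

Solution 2 scales most easily when there are many Boolean columns, since only the column list changes.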
    

    PS: If you change your own attempt to:

    df[(df['C']==True) | (df['D']==True)]
    

    it'll work too.
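
A common reason such a filter fails is combining the conditions with Python's `or` instead of `|`. The failing attempt is not shown in this thread, so the `or` line below is an assumption about what it may have looked like. A small sketch of the difference:

```python
import pandas as pd

df = pd.DataFrame({
    "C": [True, False, False, True],
    "D": [False, False, True, True],
})

# Python's `or` asks for the truth value of a whole Series, which pandas
# refuses to guess, so this raises ValueError.
try:
    df[(df["C"] == True) or (df["D"] == True)]
    or_failed = False
except ValueError:
    or_failed = True

# `|` combines the masks element-wise. The parentheses are required
# because `|` binds more tightly than `==`.
mask_verbose = (df["C"] == True) | (df["D"] == True)

# For columns that are already boolean, the comparison is redundant.
mask_short = df["C"] | df["D"]

print(or_failed, mask_verbose.equals(mask_short))
```

The same rule applies to `and` and `not`: use `&` and `~` on Series.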

    Pandas docs - boolean indexing


    Why should we NOT use the "PEP-compliant" df["col_name"] is True instead of df["col_name"] == True?

    In [11]: df = pd.DataFrame({"col":[True, True, True]})
    
    In [12]: df
    Out[12]:
        col
    0  True
    1  True
    2  True
    
    In [13]: df["col"] is True
    Out[13]: False               # <----- oops, that's not exactly what we wanted
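
The reason is that `is` tests object identity and returns one Python bool for the whole Series, while `==` compares element by element and returns a mask that can be used for indexing. A short sketch with a mixed column (the data here is made up for illustration):

```python
import pandas as pd

df = pd.DataFrame({"col": [True, False, True]})

# Identity check: is the Series object the singleton True? Never.
identity = df["col"] is True

# Element-wise comparison: a boolean Series, one value per row.
elementwise = df["col"] == True

print(type(identity).__name__)   # a plain Python bool, useless as a mask
print(elementwise.tolist())

# For a boolean column, the column itself is already a valid mask.
print(df[df["col"]].index.tolist())
```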
    
