Filtering pandas dataframe with multiple Boolean columns

闹比i 2020-12-05 07:26

I am trying to filter a df using several Boolean variables that are a part of the df, but have been unable to do so.

Sample data:

A | B | C | D
John         


        
5 answers
  •  一向 (original poster)
    2020-12-05 08:25

    The easiest way is to index the DataFrame with a Boolean condition, combining several conditions with & where needed:

    import numpy as np
    import pandas as pd

    # Sample data: (name, product, sale); the last row has a missing sale
    students = [('jack1', 'Apples1', 341),
                ('Riti1', 'Mangos1', 311),
                ('Aadi1', 'Grapes1', 301),
                ('Sonia1', 'Apples1', 321),
                ('Lucy1', 'Mangos1', 331),
                ('Mike1', 'Apples1', 351),
                ('Mik', 'Apples1', np.nan)]

    # Create a DataFrame object
    df = pd.DataFrame(students, columns=['Name1', 'Product1', 'Sale1'])
    print(df)
    
    
        Name1 Product1  Sale1
    0   jack1  Apples1  341.0
    1   Riti1  Mangos1  311.0
    2   Aadi1  Grapes1  301.0
    3  Sonia1  Apples1  321.0
    4   Lucy1  Mangos1  331.0
    5   Mike1  Apples1  351.0
    6     Mik  Apples1    NaN
    
    # Select rows where the 'Product1' column equals 'Apples1'
    subset = df[df['Product1'] == 'Apples1']
    print(subset)
    
        Name1 Product1  Sale1
    0   jack1  Apples1  341.0
    3  Sonia1  Apples1  321.0
    5   Mike1  Apples1  351.0
    6     Mik  Apples1    NaN
    
    # Select rows where 'Product1' equals 'Apples1' AND 'Sale1' is not null
    subsetx = df[(df['Product1'] == 'Apples1') & (df['Sale1'].notnull())]
    print(subsetx)
    
        Name1 Product1  Sale1
    0   jack1  Apples1  341.0
    3  Sonia1  Apples1  321.0
    5   Mike1  Apples1  351.0
    
    # Select rows where 'Product1' equals 'Apples1' AND 'Sale1' equals 351
    subsetx = df[(df['Product1'] == 'Apples1') & (df['Sale1'] == 351)]
    print(subsetx)
    
       Name1 Product1  Sale1
    5  Mike1  Apples1  351.0
    
    # Another example: select rows where 'Product1' is either 'Mangos1' or 'Grapes1'
    subsetData = df[df['Product1'].isin(['Mangos1', 'Grapes1'])]
    print(subsetData)
    
       Name1 Product1  Sale1
    1  Riti1  Mangos1  311.0
    2  Aadi1  Grapes1  301.0
    4  Lucy1  Mangos1  331.0
    
    

    This code is adapted, with minor changes, from: https://thispointer.com/python-pandas-select-rows-in-dataframe-by-conditions-on-multiple-columns/
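
The examples above build each mask from a comparison, while the question asks about columns that already hold Boolean values. The following is only a minimal sketch of that case. The data and the column names (`is_member`, `has_paid`, `is_active`) are hypothetical, since the sample table in the question is incomplete. It assumes the columns have a true `bool` dtype with no missing values.

```python
import pandas as pd

# Hypothetical data: these column names are illustrative assumptions,
# not taken from the original question.
df = pd.DataFrame({
    'Name': ['John', 'Mary', 'Alex', 'Sara'],
    'is_member': [True, True, False, True],
    'has_paid': [True, False, False, True],
    'is_active': [False, True, True, True],
})

# A Boolean column is already a valid mask, so "== True" is not needed.
members = df[df['is_member']]

# Combine several Boolean columns element-wise: & (and), | (or), ~ (not).
paid_members = df[df['is_member'] & df['has_paid']]
member_or_active = df[df['is_member'] | df['is_active']]
unpaid = df[~df['has_paid']]

# For a list of Boolean columns, reduce across each row instead.
flags = ['is_member', 'has_paid', 'is_active']
all_true = df[df[flags].all(axis=1)]   # rows where every flag is True
any_true = df[df[flags].any(axis=1)]   # rows where at least one flag is True

print(all_true)
```

When a Boolean column is mixed with a comparison, the comparison needs parentheses, as in `df[df['is_member'] & (df['Name'] == 'John')]`, because `&` binds more tightly than `==` in Python.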
