selecting from multi-index pandas

庸人自扰 2020-12-02 05:19

I have a multi-index data frame with columns 'A' and 'B'.

Is there a way to select rows by filtering on one column of the multi-index without resetting the

6 answers
  •  执念已碎
    2020-12-02 05:38

    Understanding how to access a multi-indexed pandas DataFrame helps with all kinds of tasks like this one.

    Copy and paste this into your code to generate an example:

    import numpy as np
    import pandas as pd

    # hierarchical indices and columns
    index = pd.MultiIndex.from_product([[2013, 2014], [1, 2]],
                                       names=['year', 'visit'])
    columns = pd.MultiIndex.from_product([['Bob', 'Guido', 'Sue'], ['HR', 'Temp']],
                                         names=['subject', 'type'])
    
    # mock some data
    data = np.round(np.random.randn(4, 6), 1)
    data[:, ::2] *= 10
    data += 37
    
    # create the DataFrame
    health_data = pd.DataFrame(data, index=index, columns=columns)
    health_data
    

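    This gives you a DataFrame with a two-level row index (year, visit) and two-level columns (subject, type). The values are random, so the numbers in the outputs below are only illustrative.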

    Standard access by column

    health_data['Bob']
    type         HR  Temp
    year visit
    2013 1     22.0  38.6
         2     52.0  38.3
    2014 1     30.0  38.9
         2     31.0  37.3
    
    
    health_data['Bob']['HR']
    year  visit
    2013  1        22.0
          2        52.0
    2014  1        30.0
          2        31.0
    Name: HR, dtype: float64
    
    # filtering by column/subcolumn - your case:
    health_data['Bob']['HR']==22
    year  visit
    2013  1         True
          2        False
    2014  1        False
          2        False
    Name: HR, dtype: bool
    
    health_data['Bob']['HR'][2013]    
    visit
    1    22.0
    2    52.0
    Name: HR, dtype: float64
    
    health_data['Bob']['HR'][2013][1]
    22.0
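
    The chained lookups above can also be written with a single tuple key, and the boolean Series from a comparison can be passed back to the DataFrame to keep only the matching rows. A minimal sketch, assuming fixed made-up values (`np.arange`) instead of the random data so the result is predictable:

```python
import numpy as np
import pandas as pd

# Same layout as health_data above, but with fixed illustrative values
# (an assumption for this sketch, not the data from the answer).
index = pd.MultiIndex.from_product([[2013, 2014], [1, 2]],
                                   names=['year', 'visit'])
columns = pd.MultiIndex.from_product([['Bob', 'Guido', 'Sue'], ['HR', 'Temp']],
                                     names=['subject', 'type'])
data = np.arange(24, dtype=float).reshape(4, 6)
health_data = pd.DataFrame(data, index=index, columns=columns)

# One-step column selection with a tuple instead of chained brackets
bob_hr = health_data[('Bob', 'HR')]
chained = health_data['Bob']['HR']

# Boolean mask on a sub-column, then keep only the rows where it is True
mask = health_data[('Bob', 'HR')] >= 12
filtered = health_data[mask]

# Scalar lookup: row tuple first, column tuple second
value = health_data.loc[(2013, 1), ('Bob', 'HR')]
```

    The tuple form avoids creating an intermediate DataFrame, and `filtered` keeps the full MultiIndex on both axes.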
    

    Access by row

    health_data.loc[2013]
    subject   Bob        Guido         Sue
    type       HR  Temp     HR  Temp    HR  Temp
    visit
    1        22.0  38.6   40.0  38.9  53.0  37.5
    2        52.0  38.3   42.0  34.6  30.0  37.7
    
    health_data.loc[2013,1] 
    subject  type
    Bob      HR      22.0
             Temp    38.6
    Guido    HR      40.0
             Temp    38.9
    Sue      HR      53.0
             Temp    37.5
    Name: (2013, 1), dtype: float64
    
    health_data.loc[2013,1]['Bob']
    type
    HR      22.0
    Temp    38.6
    Name: (2013, 1), dtype: float64
    
    health_data.loc[2013,1]['Bob']['HR']
    22.0
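
    `.loc[2013]` selects on the outer level. To select on an inner level such as `visit` without touching the index, `DataFrame.xs` with the `level` argument is one option. A small sketch, again assuming fixed made-up values rather than the random data:

```python
import numpy as np
import pandas as pd

# Same layout as health_data above, with fixed illustrative values
# (an assumption for this sketch).
index = pd.MultiIndex.from_product([[2013, 2014], [1, 2]],
                                   names=['year', 'visit'])
columns = pd.MultiIndex.from_product([['Bob', 'Guido', 'Sue'], ['HR', 'Temp']],
                                     names=['subject', 'type'])
health_data = pd.DataFrame(np.arange(24, dtype=float).reshape(4, 6),
                           index=index, columns=columns)

# All rows where visit == 1; the 'visit' level is dropped from the result
visit1 = health_data.xs(1, level='visit')

# Same selection, but keep both index levels
visit1_full = health_data.xs(1, level='visit', drop_level=False)

# xs also works on the column axis: only the 'HR' sub-columns
hr_only = health_data.xs('HR', level='type', axis=1)
```

    Use `drop_level=False` when later code still expects the two-level index.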
    

    Slicing multi-index

    idx=pd.IndexSlice
    health_data.loc[idx[:,1], idx[:,'HR']]
    subject      Bob Guido   Sue
    type          HR    HR    HR
    year visit
    2013 1      22.0  40.0  53.0
    2014 1      30.0  52.0  45.0
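
    For the case in the question (filtering rows on one level of the index without resetting it), another option is to build a boolean mask from `index.get_level_values`. A minimal sketch, assuming fixed made-up values in place of the random data:

```python
import numpy as np
import pandas as pd

# Same layout as health_data above, with fixed illustrative values
# (an assumption for this sketch).
index = pd.MultiIndex.from_product([[2013, 2014], [1, 2]],
                                   names=['year', 'visit'])
columns = pd.MultiIndex.from_product([['Bob', 'Guido', 'Sue'], ['HR', 'Temp']],
                                     names=['subject', 'type'])
health_data = pd.DataFrame(np.arange(24, dtype=float).reshape(4, 6),
                           index=index, columns=columns)

# Pull each index level out as a flat Index; the DataFrame itself is unchanged
year = health_data.index.get_level_values('year')
visit = health_data.index.get_level_values('visit')

# Filter on a single level
rows_2014 = health_data[year == 2014]

# Filter on a set of values for one level
second_visits = health_data[visit.isin([2])]

# Combine conditions across levels
combined = health_data[(year == 2014) & (visit == 2)]
```

    The masks are plain boolean arrays, so they combine with `&`, `|` and `~` like any other pandas filter, and the result keeps its MultiIndex.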
    
