Filtering multiple items in a multi-index Python pandas DataFrame

轮回少年 2020-12-13 17:29

I have the following table:

Note: Both NSRCODE and PBL_AWI are index levels.

Note: The % Of area column will be filled in later; I just have not done so yet.

4 Answers
  • 2020-12-13 17:44

    You can use get_level_values in conjunction with Boolean indexing.

    In [50]:
    
    # np.isin supersedes np.in1d, which is deprecated as of NumPy 2.0
    print(df[np.isin(df.index.get_level_values(1), ['Lake', 'River', 'Upland'])])
                              Area
    NSRCODE PBL_AWI               
    CM      Lake      57124.819333
            River      1603.906642
    LBH     Lake     258046.508310
            River     44262.807900
    

    The same idea can be expressed in other ways, for example df[df.index.get_level_values('PBL_AWI').isin(['Lake', 'River', 'Upland'])].

    Note that your data contains 'upland' rather than 'Upland'. Label matching is case-sensitive, so no upland rows appear in the output above.
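To make the case mismatch concrete, here is a minimal sketch. The frame is modeled on the output above; the 'upland' rows and their areas are invented for illustration, and lowercasing both sides is just one possible way to relax the match, not something the original poster asked for.

```python
import pandas as pd

# Sample frame modeled on the output above; the 'upland' rows and their
# areas are made up for illustration.
index = pd.MultiIndex.from_tuples(
    [("CM", "Lake"), ("CM", "River"), ("CM", "upland"),
     ("LBH", "Lake"), ("LBH", "River"), ("LBH", "upland")],
    names=["NSRCODE", "PBL_AWI"],
)
df = pd.DataFrame(
    {"Area": [57124.819333, 1603.906642, 1000.0,
              258046.508310, 44262.807900, 2000.0]},
    index=index,
)

wanted = ["Lake", "River", "Upland"]

# Exact match: 'Upland' does not match the lowercase 'upland' label.
exact = df[df.index.get_level_values("PBL_AWI").isin(wanted)]

# Case-insensitive match: lowercase both the level values and the wanted labels.
level_lower = df.index.get_level_values("PBL_AWI").str.lower()
relaxed = df[level_lower.isin([label.lower() for label in wanted])]
```

Here exact keeps only the Lake and River rows, while relaxed also keeps the upland rows.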

  • 2020-12-13 17:45

    This answers a slight variant of the question and might save someone else a little time. If you want a wildcard-style match on a label whose exact value you don't know, you can use something like this:

    q_labels = [ label for label in df.index.levels[1] if label.startswith('Q') ]
    new_df = df[ df.index.isin(q_labels, level=1) ]
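A variant of the same prefix match, as a rough sketch: it builds a Boolean mask from the values actually present in the level instead of going through index.levels, which can retain labels that no longer occur after earlier filtering. The frame and its labels ('Q1', 'Q2', and so on) are hypothetical.

```python
import pandas as pd

# Hypothetical labels chosen only to illustrate prefix matching.
index = pd.MultiIndex.from_tuples(
    [("A", "Q1"), ("A", "Q2"), ("A", "Lake"), ("B", "Q1"), ("B", "River")],
    names=["group", "label"],
)
df = pd.DataFrame({"value": [1, 2, 3, 4, 5]}, index=index)

# Build the mask from the values actually present in level 1,
# rather than from index.levels, which can retain unused labels.
mask = df.index.get_level_values(1).str.startswith("Q")
q_df = df[mask]
```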
    
  • 2020-12-13 17:47

    Also (from here):

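    import pandas as pd

    def filter_by(df, constraints):
        """Filter MultiIndex by sublevels; unconstrained levels keep every label."""
        # One indexer per index level: the requested labels, or slice(None) for "all"
        indexer = [constraints[name] if name in constraints else slice(None)
                   for name in df.index.names]
        # A Series takes the tuple directly; a DataFrame gets it as the row indexer
        return df.loc[tuple(indexer)] if len(df.shape) == 1 else df.loc[tuple(indexer),]
    
    # Attach the helper as a method on both Series and DataFrame
    pd.Series.filter_by = filter_by
    pd.DataFrame.filter_by = filter_by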
    

    ... to be used as

    df.filter_by({'PBL_AWI' : ['Lake', 'River', 'Upland']})
    

    (untested with Panels and higher dimension elements, but I do expect it to work)
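For anyone who would rather not patch pandas classes, the same idea can be sketched as a plain function. This is a variant, not the code above: filter_levels is a hypothetical name, it handles DataFrames only, it spells out the column slice as `:`, and the sample frame (including the 'upland' row) is invented.

```python
import pandas as pd

def filter_levels(frame, constraints):
    """Hypothetical helper: select DataFrame rows by labels on named index levels."""
    # One entry per index level: the requested labels, or slice(None) for "all".
    indexer = tuple(constraints.get(name, slice(None)) for name in frame.index.names)
    # Rows are chosen by the per-level tuple; ':' keeps every column.
    return frame.loc[indexer, :]

# Sample frame modeled on the question; the 'upland' row is made up.
index = pd.MultiIndex.from_tuples(
    [("CM", "Lake"), ("CM", "River"), ("CM", "upland"),
     ("LBH", "Lake"), ("LBH", "River")],
    names=["NSRCODE", "PBL_AWI"],
)
df = pd.DataFrame(
    {"Area": [57124.819333, 1603.906642, 1000.0, 258046.508310, 44262.807900]},
    index=index,
)

lakes_and_rivers = filter_levels(df, {"PBL_AWI": ["Lake", "River"]})
```

The example only asks for labels that exist in the data. Recent pandas versions may raise a KeyError when a requested label is missing from the level, so 'Upland' is left out here.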

  • 2020-12-13 17:49

    Another (maybe cleaner) way might be this one:

    print(df[df.index.isin(['Lake', 'River', 'Upland'], level=1)])
    

    The level parameter accepts either the level's position (starting at 0) or its name (here, level='PBL_AWI').
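A small sketch of both spellings, on an invented frame shaped like the one in the question. It also shows that negating the mask with `~` keeps everything except the listed labels.

```python
import pandas as pd

# Sample frame modeled on the question; the 'upland' rows are made up.
index = pd.MultiIndex.from_tuples(
    [("CM", "Lake"), ("CM", "River"), ("CM", "upland"),
     ("LBH", "Lake"), ("LBH", "River"), ("LBH", "upland")],
    names=["NSRCODE", "PBL_AWI"],
)
df = pd.DataFrame({"Area": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]}, index=index)

# Selecting the level by position and by name gives the same rows.
by_position = df[df.index.isin(["Lake", "River"], level=1)]
by_name = df[df.index.isin(["Lake", "River"], level="PBL_AWI")]

# Negating the mask keeps everything except the listed labels.
others = df[~df.index.isin(["Lake", "River"], level="PBL_AWI")]
```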
