Sorting Multi-Index to full depth (Pandas)

暖寄归人 2021-02-07 08:19

I have a dataframe which I'm loading from a CSV file and then setting the index to a few of its columns (usually two or three) with the set_index method. The idea is to

3 answers
  •  不要未来只要你来
    2021-02-07 08:52

    It's not really clear what you are asking. The MultiIndex docs are here

    The OP needs to set the index, then sort it in place:

    df.set_index(['fileName', 'phrase'], inplace=True)
    df.sort_index(inplace=True)  # sorts on all index levels; replaces the older sortlevel
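
    A minimal runnable sketch of the set-then-sort step, assuming a recent pandas release. The rows and the `hits` column are made up to stand in for the CSV; only the `fileName` and `phrase` column names come from the answer.

```python
import pandas as pd

# Hypothetical rows standing in for the CSV file; 'hits' is an invented column.
df = pd.DataFrame(
    {
        "fileName": ["b.txt", "a.txt", "b.txt", "a.txt"],
        "phrase": ["world", "hello", "hello", "world"],
        "hits": [4, 1, 3, 2],
    }
)

# Move the key columns into a MultiIndex, then sort on every level.
df = df.set_index(["fileName", "phrase"]).sort_index()

# A MultiIndex sorted to full depth is monotonic increasing.
print(df.index.is_monotonic_increasing)
print(df)
```

    Checking `is_monotonic_increasing` is one way to confirm that the index really is sorted on all levels before relying on fast lookups or slicing.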
    

    Then access these levels via a tuple to get a specific result:

    df.loc[('somePath', 'somePhrase')]
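
    As an illustration of the tuple lookup, here is a small sketch using `.loc` on a recent pandas. The index values and the `hits` column are invented for the example.

```python
import pandas as pd

# Hypothetical index entries; only the level names come from the answer.
index = pd.MultiIndex.from_tuples(
    [
        ("somePath", "somePhrase"),
        ("somePath", "otherPhrase"),
        ("otherPath", "somePhrase"),
    ],
    names=["fileName", "phrase"],
)
df = pd.DataFrame({"hits": [10, 20, 30]}, index=index).sort_index()

# Full key as a tuple: selects the single matching row as a Series.
row = df.loc[("somePath", "somePhrase"), :]

# Partial key (first level only): returns the sub-frame, indexed by 'phrase'.
sub = df.loc["somePath"]

print(row)
print(sub)
```

    Writing the row key as a tuple followed by `:` makes it explicit that both elements address index levels rather than a row and a column.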
    

    Maybe just give a toy example like this one and show the specific result you want to get.

    In [1]: arrays = [np.array(['bar', 'bar', 'baz', 'baz', 'foo', 'foo', 'qux', 'qux']),
       ...:           np.array(['one', 'two', 'one', 'two', 'one', 'two', 'one', 'two'])]
    
    In [2]: df = DataFrame(randn(8, 4), index=arrays)
    
    In [3]: df
    Out[3]: 
                    0         1         2         3
    bar one  1.654436  0.184326 -2.337694  0.625120
        two  0.308995  1.219156 -0.906315  1.555925
    baz one -0.180826 -1.951569  1.617950 -1.401658
        two  0.399151 -1.305852  1.530370 -0.132802
    foo one  1.097562  0.097126  0.387418  0.106769
        two  0.465681  0.270120 -0.387639 -0.142705
    qux one -0.656487 -0.154881  0.495044 -1.380583
        two  0.274045 -0.070566  1.274355  1.172247
    
    In [4]: df.index.lexsort_depth
    Out[4]: 2
    
    In [5]: df.ix[('foo','one')]
    Out[5]: 
    0    1.097562
    1    0.097126
    2    0.387418
    3    0.106769
    Name: (foo, one), dtype: float64
    
    In [6]: df.ix['foo']
    Out[6]: 
                0         1         2         3
    one  1.097562  0.097126  0.387418  0.106769
    two  0.465681  0.270120 -0.387639 -0.142705
    
    In [7]: df.ix[['foo']]
    Out[7]: 
                    0         1         2         3
    foo one  1.097562  0.097126  0.387418  0.106769
        two  0.465681  0.270120 -0.387639 -0.142705
    
    In [8]: df.sortlevel(level=1)
    Out[8]: 
                    0         1         2         3
    bar one  1.654436  0.184326 -2.337694  0.625120
    baz one -0.180826 -1.951569  1.617950 -1.401658
    foo one  1.097562  0.097126  0.387418  0.106769
    qux one -0.656487 -0.154881  0.495044 -1.380583
    bar two  0.308995  1.219156 -0.906315  1.555925
    baz two  0.399151 -1.305852  1.530370 -0.132802
    foo two  0.465681  0.270120 -0.387639 -0.142705
    qux two  0.274045 -0.070566  1.274355  1.172247
    
    In [10]: df.sortlevel(level=1).index.lexsort_depth
    Out[10]: 0
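
    The session above was recorded on an older pandas. `.ix` and `sortlevel` were removed in pandas 1.0, so the following is a rough equivalent for recent releases, using `sort_index` and `is_monotonic_increasing` as the sortedness check. The seeded random generator is an assumption added here so the sketch is reproducible; the values will differ from the output shown above.

```python
import numpy as np
import pandas as pd

arrays = [
    np.array(["bar", "bar", "baz", "baz", "foo", "foo", "qux", "qux"]),
    np.array(["one", "two", "one", "two", "one", "two", "one", "two"]),
]
rng = np.random.default_rng(0)  # seed chosen arbitrarily for reproducibility
df = pd.DataFrame(rng.standard_normal((8, 4)), index=arrays)

# The index as constructed is already sorted on both levels.
print(df.index.is_monotonic_increasing)

# Sorting with the second level as the primary key breaks the full-depth ordering.
by_second = df.sort_index(level=1)
print(by_second.index.is_monotonic_increasing)

# A plain sort_index() sorts on all levels again and restores it.
restored = by_second.sort_index()
print(restored.index.is_monotonic_increasing)
```

    This mirrors the point of the last two session steps: once the frame is ordered by the second level first, the index is no longer sorted to full depth, and a full `sort_index()` is needed to get that back.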
    
