Converting Index into MultiIndex (hierarchical index) in Pandas

广开言路 2021-02-04 08:49

In the data I am working with, the index is compound, i.e. each label combines an item name and a timestamp, e.g. name@domain.com|2013-05-07 05:52:51 +0200.

I want to

3 Answers
  •  青春惊慌失措
    2021-02-04 08:52

    Once we have a DataFrame

    import pandas as pd
    df = pd.read_csv("input.csv", index_col=0)  # or from another source
    

    and a function that maps each index label to a tuple (the one below handles the format from this question)

    def process_index(k):
        return tuple(k.split("|"))
    

    we can create a hierarchical index in the following way:

    # iterate over the index labels directly; iterrows() would also build every row
    df.index = pd.MultiIndex.from_tuples([process_index(k) for k in df.index])
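
To try the from_tuples approach without an input file, here is a minimal sketch. The sample rows, the "value" column and the level names are made up for illustration; only the "name|timestamp" label format comes from the question.

```python
import pandas as pd

# Hypothetical sample data using the "name|timestamp" label format from the question.
df = pd.DataFrame(
    {"value": [1, 2, 3]},
    index=[
        "a@example.com|2013-05-07 05:52:51 +0200",
        "a@example.com|2013-05-08 10:00:00 +0200",
        "b@example.com|2013-05-07 06:00:00 +0200",
    ],
)

def process_index(k):
    # "name|timestamp" -> ("name", "timestamp")
    return tuple(k.split("|"))

# Build the hierarchical index; the level names are an illustrative choice.
df.index = pd.MultiIndex.from_tuples(
    [process_index(k) for k in df.index],
    names=["e-mail", "date"],
)

# Selecting on the first level returns the rows for one address,
# indexed by the remaining "date" level.
rows_for_a = df.loc["a@example.com"]
print(rows_for_a)
```

Naming the levels is optional, but it allows later calls such as df.index.get_level_values("date") to refer to a level by name instead of by position.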
    

    An alternative approach is to create two columns and then set them as the index (the original index is replaced):

    df['e-mail'] = [x.split("|")[0] for x in df.index] 
    df['date'] = [x.split("|")[1] for x in df.index]
    df = df.set_index(['e-mail', 'date'])
    

    or, even shorter:

    df['e-mail'], df['date'] = zip(*map(process_index, df.index))
    df = df.set_index(['e-mail', 'date'])
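
All of the variants above leave the timestamp level as plain strings. If real datetimes are wanted, one possible sketch is to split the index with the vectorized string accessor and parse the second level. The sample rows and the name df2 are hypothetical, the format string is an assumption based on the single example label in the question, and utc=True is a choice made here so that labels with different offsets end up in one time zone.

```python
import pandas as pd

# Hypothetical sample data using the "name|timestamp" label format from the question.
df2 = pd.DataFrame(
    {"value": [10, 20]},
    index=[
        "a@example.com|2013-05-07 05:52:51 +0200",
        "b@example.com|2013-05-07 06:00:00 +0200",
    ],
)

# On an Index, str.split(..., expand=True) returns a MultiIndex of the parts.
parts = df2.index.str.split("|", expand=True)

emails = parts.get_level_values(0)
# Parse the timestamp strings; the format is assumed from the example label.
dates = pd.to_datetime(
    parts.get_level_values(1),
    format="%Y-%m-%d %H:%M:%S %z",
    utc=True,  # normalize all offsets to UTC
)

df2.index = pd.MultiIndex.from_arrays([emails, dates], names=["e-mail", "date"])
print(df2)
```

With a datetime level in place, the "date" level sorts and compares chronologically instead of as text.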
    
