Splitting multiple columns into rows in pandas dataframe

独厮守ぢ 2020-12-03 09:16

I have a pandas dataframe as follows:

ticker    account      value         date
aa       assets       100,200       20121231, 20131231
bb       liabilities           


        
5 answers
  •  無奈伤痛
    2020-12-03 09:37

    I wrote an explode function based on the previous answers. It might be useful for anyone who wants to grab it and use it quickly.

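    import pandas as pd

    def explode(df, cols, split_on=','):
        """
        Explode dataframe on the given columns, split on the given delimiter.
        Each column in cols must split into the same number of pieces per row.
        """
        # Columns left untouched; a set difference does not preserve column order
        cols_sep = list(set(df.columns) - set(cols))
        df_cols = df[cols_sep]
        # Number of pieces per row, measured on the first column to split
        explode_len = df[cols[0]].str.split(split_on).map(len)
        # Repeat each row of the untouched columns once per piece
        repeat_list = []
        # to_numpy() replaces as_matrix(), which was removed in pandas 1.0
        for r, e in zip(df_cols.to_numpy(), explode_len):
            repeat_list.extend([list(r)]*e)
        df_repeat = pd.DataFrame(repeat_list, columns=cols_sep)
        # Split each target column, stack the pieces into one long column, strip whitespace
        df_explode = pd.concat([df[col].str.split(split_on, expand=True).stack().str.strip().reset_index(drop=True)
                                for col in cols], axis=1)
        df_explode.columns = cols
        return pd.concat((df_repeat, df_explode), axis=1)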
    

    Example, using the data given by @piRSquared:

    df = pd.DataFrame([['aa', 'assets', '100,200', '20121231,20131231'],
                       ['bb', 'liabilities', '50,50', '20141231,20131231']],
                      columns=['ticker', 'account', 'value', 'date'])
    explode(df, ['value', 'date'])
    

    Output:

    +-----------+------+-----+--------+
    |    account|ticker|value|    date|
    +-----------+------+-----+--------+
    |     assets|    aa|  100|20121231|
    |     assets|    aa|  200|20131231|
    |liabilities|    bb|   50|20141231|
    |liabilities|    bb|   50|20131231|
    +-----------+------+-----+--------+
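
On pandas 1.3 or later, the helper above may not be needed: the built-in `DataFrame.explode` accepts a list of columns. The following is only a minimal sketch of that alternative, not part of the original answer. It assumes that every column being exploded splits into the same number of pieces in each row, and the names `split` and `result` are chosen here purely for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    [['aa', 'assets', '100,200', '20121231,20131231'],
     ['bb', 'liabilities', '50,50', '20141231,20131231']],
    columns=['ticker', 'account', 'value', 'date'],
)

cols = ['value', 'date']

# Turn each delimited string into a list of pieces (illustrative variable name)
split = df.assign(**{c: df[c].str.split(',') for c in cols})

# Explode all listed columns in parallel; multi-column explode needs pandas >= 1.3
result = split.explode(cols, ignore_index=True)

# Strip stray whitespace around the pieces, as the helper above does
for c in cols:
    result[c] = result[c].str.strip()

print(result)
```

Unlike the set-based helper, this keeps the original column order (ticker, account, value, date).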
    
