How to concatenate multiple column values into a single column in a pandas DataFrame

梦谈多话 2020-12-02 14:25

This question is the same as one posted earlier, except that I want to concatenate three columns instead of two.

Here is the code for combining two columns:

         


        
11 answers
  •  没有蜡笔的小新
    2020-12-02 15:06

    The answer given by @allen is reasonably generic, but it can perform poorly on larger DataFrames.

    A reduce over the columns does a lot better:

    from functools import reduce
    
    import pandas as pd
    
    # make data
    df = pd.DataFrame(index=range(1_000_000))
    df['1'] = 'CO'
    df['2'] = 'BOB'
    df['3'] = '01'
    df['4'] = 'BILL'
    
    
    def reduce_join(df, columns):
        assert len(columns) > 1
        slist = [df[x].astype(str) for x in columns]
        return reduce(lambda x, y: x + '_' + y, slist[1:], slist[0])
    
    
    def apply_join(df, columns):
        assert len(columns) > 1
        return df[columns].apply(lambda row: '_'.join(row.values.astype(str)), axis=1)
    
    # ensure outputs are equal
    df1 = reduce_join(df, list('1234'))
    df2 = apply_join(df, list('1234'))
    assert df1.equals(df2)
    
    # profile (%timeit is an IPython/Jupyter magic; timings are as reported)
    %timeit df1 = reduce_join(df, list('1234'))  # 733 ms
    %timeit df2 = apply_join(df, list('1234'))   # 8.84 s
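
For the three-column case asked about in the question, the same idea can be sketched on a small frame. The column names `foo`, `bar`, and `baz` and the sample values are hypothetical, chosen only for illustration. The sketch also shows `Series.str.cat` as one possible alternative to `reduce`; it was not benchmarked here, so no claim is made about its relative speed.

```python
from functools import reduce

import pandas as pd

# Hypothetical three-column frame; names and values are illustrative only.
df = pd.DataFrame({
    "foo": ["a", "b", "c"],
    "bar": [1, 2, 3],
    "baz": [1.5, 2.5, 3.5],
})
columns = ["foo", "bar", "baz"]

# Cast each column to str once, then add whole Series pairwise.
parts = [df[c].astype(str) for c in columns]
combined_reduce = reduce(lambda left, right: left + "_" + right, parts)

# Alternative: Series.str.cat appends a list of Series with a separator.
combined_cat = parts[0].str.cat(parts[1:], sep="_")

df["combined"] = combined_reduce
print(df["combined"].tolist())  # ['a_1_1.5', 'b_2_2.5', 'c_3_3.5']
```

Both variants cast to `str` first, so mixed integer, float, and string columns can be joined without a type error.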
    
    
