How to concatenate multiple column values into a single column in a pandas DataFrame

梦谈多话 2020-12-02 14:25

This question is similar to one posted earlier, but I want to concatenate three columns instead of two.

Here is how two columns are combined:

         


        
11 answers
  •  情深已故
    2020-12-02 14:55

    Possibly the fastest solution is to operate in plain Python:

    from pandas import Series

    # Join each row's values with '_' using plain Python string joining.
    Series(
        map(
            '_'.join,
            df.values.tolist()
            # when non-string columns are present:
            # df.values.astype(str).tolist()
        ),
        index=df.index
    )
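
As a minimal end-to-end sketch of the same idea: the column names `bar`, `foo`, and `new` are borrowed from the timing snippet in this answer, while the sample values are made up for illustration. The frame is cast to strings first so that `'_'.join` also accepts the numeric column.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the thread.
df = pd.DataFrame({
    "bar": [1, 2, 3],
    "foo": ["a", "b", "c"],
    "new": ["apple", "banana", "pear"],
})

# Cast every cell to str, convert the rows to plain Python lists,
# and join each row with '_' outside of pandas.
combined = pd.Series(
    map("_".join, df.astype(str).values.tolist()),
    index=df.index,
)

print(combined.tolist())
# ['1_a_apple', '2_b_banana', '3_c_pear']
```

Passing `index=df.index` keeps the result aligned with the original rows, so it can be assigned back as a new column.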
    

    Comparison against @MaxU's answer (using the big data frame, which has both numeric and string columns):

    %timeit big['bar'].astype(str) + '_' + big['foo'] + '_' + big['new']
    # 29.4 ms ± 1.08 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
    
    
    %timeit Series(map('_'.join, big.values.astype(str).tolist()), index=big.index)
    # 27.4 ms ± 2.36 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
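
Outside IPython, a comparison like this could be reproduced with the standard-library `timeit` module. The sketch below builds a hypothetical stand-in for `big` (its real contents are not shown in this answer), and the function names `vectorized` and `plain_python` are chosen here for illustration. Timings depend on the machine and the pandas version, so none are quoted.

```python
import timeit

import pandas as pd

# Hypothetical stand-in for `big`: one numeric and two string columns.
n = 10_000
big = pd.DataFrame({
    "bar": range(n),
    "foo": ["a"] * n,
    "new": ["apple"] * n,
})


def vectorized(frame):
    # Column-wise string concatenation inside pandas.
    return frame["bar"].astype(str) + "_" + frame["foo"] + "_" + frame["new"]


def plain_python(frame):
    # Row-wise join in plain Python after casting all cells to str.
    return pd.Series(
        map("_".join, frame.values.astype(str).tolist()),
        index=frame.index,
    )


# Time each approach over 10 calls and report the mean per call.
for func in (vectorized, plain_python):
    seconds = timeit.timeit(lambda: func(big), number=10)
    print(f"{func.__name__}: {seconds / 10 * 1000:.2f} ms per call")
```

Both functions produce the same strings because the columns of this stand-in frame are in the order `bar`, `foo`, `new`; `plain_python` joins columns in frame order, so a different column order would change its output.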
    

    Comparison against @derchambers's answer (using their df data frame, where all columns are strings):

    from functools import reduce
    from pandas import Series
    
    def reduce_join(df, columns):
        slist = [df[x] for x in columns]
        return reduce(lambda x, y: x + '_' + y, slist[1:], slist[0])
    
    def list_map(df, columns):
        return Series(
            map(
                '_'.join,
                df[columns].values.tolist()
            ),
            index=df.index
        )
    
    %timeit df1 = reduce_join(df, list('1234'))
    # 602 ms ± 39 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
    
    %timeit df2 = list_map(df, list('1234'))
    # 351 ms ± 12.1 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
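
Before comparing timings, it may be worth checking that both helpers return the same values. The sketch below does that on a small hypothetical frame with string columns named `'1'` to `'4'`; the cell values are invented, since the original `df` is not shown here.

```python
from functools import reduce

import pandas as pd

# Hypothetical all-string frame with columns '1'..'4'.
df = pd.DataFrame(
    {c: [f"c{c}r{i}" for i in range(3)] for c in "1234"}
)


def reduce_join(df, columns):
    # Fold the columns together with vectorized string addition.
    slist = [df[x] for x in columns]
    return reduce(lambda x, y: x + "_" + y, slist[1:], slist[0])


def list_map(df, columns):
    # Join each row's values in plain Python.
    return pd.Series(
        map("_".join, df[columns].values.tolist()),
        index=df.index,
    )


cols = list("1234")
assert reduce_join(df, cols).tolist() == list_map(df, cols).tolist()
print(list_map(df, cols).iloc[0])
# c1r0_c2r0_c3r0_c4r0
```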
    
