Concatenate specific pairs of columns in a dataframe based on a reference dataframe with a different index

不思量自难忘° 2021-01-14 18:24

My goal is to concatenate columns in a dataframe (Source), based on pairs that are described in a separate dataframe (Reference). The resulting dataframe should replace the c

2 answers
  •  情书的邮戳
    2021-01-14 18:31

    There are probably better solutions, but this one works:

    import pandas as pd
    
    df1 = pd.DataFrame({'FIRST': pd.Series(['Alpha', 'Alpha', 'Charlie'],
                                           index=['H1', 'H2',  'H3']),
                        'SECOND': pd.Series(['Bravo', 'Delta', 'Delta'],
                                            index=['H1', 'H2', 'H3'])})
    
    df2 = pd.DataFrame({'Alpha' : pd.Series(['A', 'C'], index = ['item-000', 'item-111']),
                        'Bravo' : pd.Series(['A', 'C'], index = ['item-000', 'item-111']),
                        'Delta' : pd.Series(['T', 'C'], index = ['item-000', 'item-111']),
                        'Charlie' : pd.Series(['T', 'G'], index = ['item-000', 'item-111'])})
    
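    # For each row of df2, map the column names listed in df1 to that row's
    # values and join each pair. The dict keys become the row labels after .T
    # (rename_axis with a Series mapper raises ValueError in recent pandas).
    pd.concat({idx: df1.T.apply(lambda x: x.map(df2.loc[idx]).str.cat())
               for idx in df2.index},
              axis=1).T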
    
    Out[]:
              H1  H2  H3
    item-000  AA  AT  TT
    item-111  CC  CC  GC
    

    This relies on both a Python-level loop over df2.index and an apply over the columns of df1.T, so it will not be very efficient on large dataframes.
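
    If the loop and apply become a bottleneck, one possible alternative is to select the columns of df2 in the order given by FIRST and SECOND and add the two blocks element-wise. This is only a sketch: the names first, second and result are illustrative, and it assumes every name in df1 exists as a column of df2 and that all values are strings.

```python
import pandas as pd

df1 = pd.DataFrame({'FIRST': ['Alpha', 'Alpha', 'Charlie'],
                    'SECOND': ['Bravo', 'Delta', 'Delta']},
                   index=['H1', 'H2', 'H3'])

df2 = pd.DataFrame({'Alpha': ['A', 'C'],
                    'Bravo': ['A', 'C'],
                    'Delta': ['T', 'C'],
                    'Charlie': ['T', 'G']},
                   index=['item-000', 'item-111'])

# Pick the df2 columns named in FIRST and SECOND, in df1's row order.
# Repeated names (e.g. 'Alpha' twice) simply repeat the column.
first = df2[df1['FIRST'].tolist()].to_numpy(dtype=object)
second = df2[df1['SECOND'].tolist()].to_numpy(dtype=object)

# Adding object arrays of str concatenates element by element.
result = pd.DataFrame(first + second, index=df2.index, columns=df1.index)
print(result)
```

    This avoids both the loop and the apply, at the cost of holding two intermediate arrays of the same shape as the result.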
