Pandas Dataframe Multiindex Merge

前端 未结 4 426
萌比男神i
萌比男神i 2020-12-16 14:31

I wanted to ask a questions regarding merging multiindex dataframe in pandas, here is a hypothetical scenario:

arrays = [[\'bar\', \'bar\', \'baz\', \'baz\'         


        
相关标签:
4条回答
  • 2020-12-16 15:01

    rename_axis

    You can rename the index levels of one and let join do its thing

    s1.join(s2.rename_axis(s1.index.names))
    
                        s1        s2
    first second                    
    bar   one    -0.696420 -1.040463
          two     0.640891  1.483262
    baz   one     1.598837  0.097424
          two     0.003994 -0.948419
    foo   one    -0.717401  1.190019
          two    -1.201237 -0.000738
    qux   one     0.559684 -0.505640
          two     1.979700  0.186013
    

    concat

    pd.concat([s1, s2], axis=1)
    
                        s1        s2
    first second                    
    bar   one    -0.696420 -1.040463
          two     0.640891  1.483262
    baz   one     1.598837  0.097424
          two     0.003994 -0.948419
    foo   one    -0.717401  1.190019
          two    -1.201237 -0.000738
    qux   one     0.559684 -0.505640
          two     1.979700  0.186013
    
    0 讨论(0)
  • 2020-12-16 15:09

    Seems like you need to use a combination of them.

    s1.merge(s2, left_index=True, right_on=['third', 'fourth'])
    #s1.merge(s2, right_index=True, left_on=['first', 'second'])
    

    Output:

                   s1        s2
    bar one  0.765385 -0.365508
        two  1.462860  0.751862
    baz one  0.304163  0.761663
        two -0.816658 -1.810634
    foo one  1.891434  1.450081
        two  0.571294  1.116862
    qux one  1.056516 -0.052927
        two -0.574916 -1.197596
    
    0 讨论(0)
  • 2020-12-16 15:13

    Assign it by combine_first

    s1.combine_first(s2)
    Out[19]: 
                        s1        s2
    first second                    
    bar   one     0.039203  0.795963
          two     0.454782 -0.222806
    baz   one     3.101120 -0.645474
          two    -1.174929 -0.875561
    foo   one    -0.887226  1.078218
          two     1.507546 -1.078564
    qux   one     0.028048  0.042462
          two     0.826544 -0.375351
    
    # s2.combine_first(s1)
    
    0 讨论(0)
  • 2020-12-16 15:23

    Other than using the indexes names as pointed by @ALollz, you can simply use loc, which will match indexes automatically

    s1.loc[:, 's2'] = s2   # Or explicitly, s2['s2']
    
                    s1           s2
    first   second      
    bar     one     -0.111384   -2.341803
            two     -1.226569    1.308240
    baz     one      1.880835    0.697946
            two     -0.008979   -0.247896
    foo     one      0.103864   -1.039990
            two      0.836931    0.000811
    qux     one     -0.859005   -1.199615
            two     -0.321341   -1.098691
    

    A general formula would be

    s1.loc[:, s2.columns] = s2
    
    0 讨论(0)
提交回复
热议问题