Pandas Melt with Multiple Value Vars

南笙 2020-12-30 14:52

I have a dataset in wide format, like this:

   Index Country     Variable 2000 2001 2002 2003 2004 2005
   0     Argentina   var1     12   15   18   17   23   29


        
3 Answers
  •  轮回少年
    2020-12-30 15:16

    Option 1

    Use melt, then unstack the Variable level so that var1, var2, etc. become columns:

    (df1.melt(id_vars=['Country','Variable'],var_name='Year')
        .set_index(['Country','Year','Variable'])
        .squeeze()
        .unstack()
        .reset_index())
    

    Output:

    Variable    Country  Year  var1  var2
    0         Argentina  2000    12     1
    1         Argentina  2001    15     3
    2         Argentina  2002    18     2
    3         Argentina  2003    17     5
    4         Argentina  2004    23     7
    5         Argentina  2005    29     5
    6            Brazil  2000    20     0
    7            Brazil  2001    23     1
    8            Brazil  2002    25     2
    9            Brazil  2003    29     2
    10           Brazil  2004    31     3
    11           Brazil  2005    32     3
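
    To try Option 1 end to end, here is a minimal self-contained sketch. The frame is rebuilt from the values in the output above; treating the year column labels as strings is an assumption, since the question does not show their dtype.

```python
import pandas as pd

# Sample data rebuilt from the output table above.
# Assumption: the year column labels are strings.
years = ['2000', '2001', '2002', '2003', '2004', '2005']
rows = [
    ['Argentina', 'var1', 12, 15, 18, 17, 23, 29],
    ['Argentina', 'var2', 1, 3, 2, 5, 7, 5],
    ['Brazil', 'var1', 20, 23, 25, 29, 31, 32],
    ['Brazil', 'var2', 0, 1, 2, 2, 3, 3],
]
df1 = pd.DataFrame(rows, columns=['Country', 'Variable'] + years)

long_df = (df1.melt(id_vars=['Country', 'Variable'], var_name='Year')  # one row per (Country, Variable, Year)
              .set_index(['Country', 'Year', 'Variable'])              # leaves a single 'value' column
              .squeeze()                                               # one-column frame -> Series
              .unstack()                                               # innermost level (Variable) -> columns
              .reset_index())
print(long_df)
```

    The squeeze() call matters here: unstacking the one-column frame directly would keep 'value' as an extra column level.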
    

    Option 2

    Using pivot then stack:

    (df1.pivot(index='Country',columns='Variable')
       .stack(0)
       .rename_axis(['Country','Year'])
       .reset_index())
    

    Output:

    Variable    Country  Year  var1  var2
    0         Argentina  2000    12     1
    1         Argentina  2001    15     3
    2         Argentina  2002    18     2
    3         Argentina  2003    17     5
    4         Argentina  2004    23     7
    5         Argentina  2005    29     5
    6            Brazil  2000    20     0
    7            Brazil  2001    23     1
    8            Brazil  2002    25     2
    9            Brazil  2003    29     2
    10           Brazil  2004    31     3
    11           Brazil  2005    32     3
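
    It may help to look at the intermediate result of Option 2. Called without values, pivot keeps every remaining column and returns two-level columns (year on the outer level, Variable on the inner one), which is why stack(0) moves the years into the rows. A small sketch, using only the 2000 and 2001 values from the output above and assuming string year labels:

```python
import pandas as pd

# Reduced sample (two years only); values come from the output above.
# Assumption: the year column labels are strings.
df1 = pd.DataFrame({
    'Country': ['Argentina', 'Argentina', 'Brazil', 'Brazil'],
    'Variable': ['var1', 'var2', 'var1', 'var2'],
    '2000': [12, 1, 20, 0],
    '2001': [15, 3, 23, 1],
})

# No `values` argument: the year columns become the outer column level.
wide = df1.pivot(index='Country', columns='Variable')
print(wide.columns.nlevels)
print(wide.columns.tolist())

result = (wide.stack(0)                            # move the outer (year) level into the index
              .rename_axis(['Country', 'Year'])    # name the two index levels
              .reset_index())
print(result)
```

    Recent pandas 2.x releases may emit a FutureWarning about the stack implementation here; the result should be the same.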
    

    Option 3 (ayhan's solution)

    Using set_index, stack, and unstack:

    (df1.set_index(['Country', 'Variable'])
        .rename_axis(['Year'], axis=1)
        .stack()
        .unstack('Variable')
        .reset_index())
    

    Output:

    Variable    Country  Year  var1  var2
    0         Argentina  2000    12     1
    1         Argentina  2001    15     3
    2         Argentina  2002    18     2
    3         Argentina  2003    17     5
    4         Argentina  2004    23     7
    5         Argentina  2005    29     5
    6            Brazil  2000    20     0
    7            Brazil  2001    23     1
    8            Brazil  2002    25     2
    9            Brazil  2003    29     2
    10           Brazil  2004    31     3
    11           Brazil  2005    32     3
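
    As a quick sanity check that the approaches agree, the sketch below runs Option 3 and Option 1 on the same small frame and compares the results. The name `sample` is hypothetical, the data is a three-year subset of the values above, and string year labels are an assumption.

```python
import pandas as pd

# `sample` is a hypothetical name; values are a subset of the output above.
sample = pd.DataFrame(
    [['Argentina', 'var1', 12, 15, 18],
     ['Argentina', 'var2', 1, 3, 2],
     ['Brazil', 'var1', 20, 23, 25],
     ['Brazil', 'var2', 0, 1, 2]],
    columns=['Country', 'Variable', '2000', '2001', '2002'])

# Option 3: set_index, stack, unstack
via_stack = (sample.set_index(['Country', 'Variable'])
                   .rename_axis(['Year'], axis=1)
                   .stack()
                   .unstack('Variable')
                   .reset_index())

# Option 1: melt, then unstack
via_melt = (sample.melt(id_vars=['Country', 'Variable'], var_name='Year')
                  .set_index(['Country', 'Year', 'Variable'])
                  .squeeze()
                  .unstack()
                  .reset_index())

# Sort by the key columns first so row order cannot affect the comparison.
key = ['Country', 'Year']
same = (via_stack.sort_values(key).values.tolist()
        == via_melt.sort_values(key).values.tolist())
print(same)
```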
    
