Transposing a column in a pandas dataframe while keeping other column intact with duplicates

和自甴很熟 submitted on 2019-11-26 10:02:01

Question


My data frame is as follows

selection_id  last_traded_price
430494        1.46
430494        1.48
430494        1.56
430494        1.57
430495        2.45
430495        2.67
430495        2.72
430495        2.87

I have many rows for each selection ID. I need to keep the selection_id column as it is, but transpose the values in last_traded_price so that the result looks like this:

selection_id  last_traded_price
430494        1.46              1.48    1.56    1.57    etc.
430495        2.45              2.67    2.72    2.87    etc.

I've tried to use a pivot:

   df.pivot(index='selection_id', columns='last_traded_price', values='last_traded_price')

Pivot isn't working because of the duplicate values in selection_id. Is it possible to transpose the data first and drop the duplicates afterwards?


Answer 1:


Option 1
groupby + apply

v = df.groupby('selection_id').last_traded_price.apply(list)
pd.DataFrame(v.tolist(), index=v.index)

                 0     1     2     3
selection_id                        
430494        1.46  1.48  1.56  1.57
430495        2.45  2.67  2.72  2.87
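
If the groups do not all have the same number of prices, which the question does not rule out, Option 1 should still work, because shorter rows are padded with NaN. A minimal sketch, assuming a made-up sample in which 430495 has only two prices:

```python
import pandas as pd

# Hypothetical sample modeled on the question; 430495 is deliberately shorter
df = pd.DataFrame({
    "selection_id": [430494, 430494, 430494, 430495, 430495],
    "last_traded_price": [1.46, 1.48, 1.56, 2.45, 2.67],
})

# Collect each selection's prices into a list, keeping the original row order
v = df.groupby("selection_id")["last_traded_price"].apply(list)

# One row per selection_id; lists shorter than the longest are padded with NaN
wide = pd.DataFrame(v.tolist(), index=v.index)
print(wide)
```

The number of columns equals the length of the longest group, so one unusually long selection widens the whole frame.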

Option 2
You can also do this with pivot, as long as you add a column that numbers the rows within each selection_id. pivot needs such a column to supply the new column labels, and cumcount provides it.

df['Count'] = df.groupby('selection_id').cumcount()
df.pivot(index='selection_id', columns='Count', values='last_traded_price')

Count            0     1     2     3
selection_id                        
430494        1.46  1.48  1.56  1.57
430495        2.45  2.67  2.72  2.87
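
The counter column is what makes the reshape well defined: pivot raises an error when an (index, columns) pair occurs more than once, and cumcount guarantees that each (selection_id, Count) pair is unique even if a price repeats. A small self-contained sketch, assuming a made-up sample in which 430494 trades at 1.46 twice:

```python
import pandas as pd

# Hypothetical sample: the price 1.46 appears twice for 430494
df = pd.DataFrame({
    "selection_id": [430494, 430494, 430494, 430495, 430495, 430495],
    "last_traded_price": [1.46, 1.46, 1.56, 2.45, 2.67, 2.72],
})

# Number the rows within each selection_id: 0, 1, 2, ...
df["Count"] = df.groupby("selection_id").cumcount()

# Every (selection_id, Count) pair is unique, which is what pivot requires
assert not df.duplicated(["selection_id", "Count"]).any()

# Keyword arguments are used because recent pandas versions require them
wide = df.pivot(index="selection_id", columns="Count", values="last_traded_price")
print(wide)
```

Repeated prices are kept here, so if duplicates should be dropped, calling drop_duplicates on the long frame before computing the counter is one way to do it.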



Answer 2:


You can use cumcount to build a counter that supplies the new column names, then reshape with set_index + unstack or with pivot:

g = df.groupby('selection_id').cumcount()
df = df.set_index(['selection_id',g])['last_traded_price'].unstack()
print(df)
                 0     1     2     3
selection_id                        
430494        1.46  1.48  1.56  1.57
430495        2.45  2.67  2.72  2.87

Similar solution with pivot:

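# Recent pandas versions expect pivot to be given column names rather than Series,
# so add the counter as a column first, then drop the 'Count' label from the header
df = (df.assign(Count=df.groupby('selection_id').cumcount())
        .pivot(index='selection_id', columns='Count', values='last_traded_price')
        .rename_axis(columns=None))
print(df)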
                 0     1     2     3
selection_id                        
430494        1.46  1.48  1.56  1.57
430495        2.45  2.67  2.72  2.87
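
Both results above keep selection_id as the index and use integers as column labels. If selection_id should be an ordinary column again, as in the layout the question asks for, something like the following may help. It is only a sketch: the `price_` prefix and the names `wide` and `flat` are arbitrary choices for this example, not part of the original answers.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    "selection_id": [430494] * 4 + [430495] * 4,
    "last_traded_price": [1.46, 1.48, 1.56, 1.57, 2.45, 2.67, 2.72, 2.87],
})

# Position of each row within its selection_id group
g = df.groupby("selection_id").cumcount()

# Move the counter into the columns, one row per selection_id
wide = df.set_index(["selection_id", g])["last_traded_price"].unstack()

# Give the columns string names and turn the index back into a column
flat = wide.add_prefix("price_").reset_index()
print(flat)
```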


Source: https://stackoverflow.com/questions/48338381/transposing-a-column-in-a-pandas-dataframe-while-keeping-other-column-intact-wit
