Python, Merging rows with same value in one column

Posted by 半世苍凉 on 2021-02-20 04:27:26

Question


My dataframe looks like this:

     ID         Class
      0           9
      1           8
      1           6
      2           6
      2           2
      3           15
      3           1
      3           8

What I would like to do is merge rows that share the same ID value, like this:

    ID       Class1 Class2 Class3
    0           9
    1           8      6
    2           6      2
    3           15     1      8

So for each ID that occurs more than once, I want to create new column(s) and move the values from its rows into those columns. What is the fastest way to do this? I tried using groupby, but it didn't give me appropriate results.


Answer 1:


Use set_index with cumcount to number the rows within each ID, reshape with unstack, and finally rename the columns with add_prefix:

# Parentheses are needed so the chained calls can span several lines
df = (df.set_index(['ID', df.groupby('ID').cumcount()])['Class']
        .unstack()
        .add_prefix('Class')
        .reset_index())

print (df)
   ID  Class0  Class1  Class2
0   0     9.0     NaN     NaN
1   1     8.0     6.0     NaN
2   2     6.0     2.0     NaN
3   3    15.0     1.0     8.0
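
The output above differs from the requested layout in two small ways: the columns start at Class0 and the values become floats because of the NaN padding. The following is a minimal, self-contained sketch of the same set_index/cumcount/unstack idea that addresses both. The names `pos` and `wide`, the `+ 1` offset, and the cast to the nullable `Int64` dtype are additions for illustration, not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({
    "ID":    [0, 1, 1, 2, 2, 3, 3, 3],
    "Class": [9, 8, 6, 6, 2, 15, 1, 8],
})

# Position of each row within its ID group (0 for the first row of each ID)
pos = df.groupby("ID").cumcount()

wide = (
    df.set_index(["ID", pos + 1])["Class"]  # +1 so the columns start at Class1
      .unstack()                            # inner index level becomes columns
      .astype("Int64")                      # nullable integers instead of floats
      .add_prefix("Class")
      .reset_index()
)
print(wide)
```

Missing cells are shown as `<NA>` rather than NaN when the nullable integer dtype is used.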

Another solution is to build a list per group and pass the lists to the DataFrame constructor:

s = df.groupby('ID')['Class'].apply(list)
df = (pd.DataFrame(s.values.tolist(), index=s.index)
        .add_prefix('Class')
        .reset_index())
print (df)
   ID  Class0  Class1  Class2
0   0       9     NaN     NaN
1   1       8     6.0     NaN
2   2       6     2.0     NaN
3   3      15     1.0     8.0
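
For reuse, the list-based approach can be wrapped in a small helper. This is only a sketch: the function name `widen_by_lists` and its `key`/`value` parameters are hypothetical and do not come from the original answer.

```python
import pandas as pd

def widen_by_lists(df, key="ID", value="Class"):
    """Collect `value` into one list per `key`, then spread the lists into columns."""
    lists = df.groupby(key)[value].apply(list)
    # The DataFrame constructor pads the shorter lists with NaN
    wide = pd.DataFrame(lists.tolist(), index=lists.index)
    return wide.add_prefix(value).reset_index()

df = pd.DataFrame({
    "ID":    [0, 1, 1, 2, 2, 3, 3, 3],
    "Class": [9, 8, 6, 6, 2, 15, 1, 8],
})
wide = widen_by_lists(df)
print(wide)
```

Only the first column is guaranteed to have a value for every ID, which is why Class0 stays an integer column in the output above while the padded columns become floats.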

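EDIT: To get one 0/1 indicator column per possible class value (here 0 to 16) instead, start again from the original df:

df = df.set_index('ID')
# dtype=int keeps the 0/1 output shown below; newer pandas versions return booleans by default
df1 = (pd.get_dummies(df['Class'], dtype=int)
         .reindex(columns=range(17), fill_value=0)
         .add_prefix('Class'))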
df1 = df1.groupby(level=0).max().reset_index()
print (df1)
   ID  Class0  Class1  Class2  Class3  Class4  Class5  Class6  Class7  Class8  \
0   0       0       0       0       0       0       0       0       0       0   
1   1       0       0       0       0       0       0       1       0       1   
2   2       0       0       1       0       0       0       1       0       0   
3   3       0       1       0       0       0       0       0       0       1   

   Class9  Class10  Class11  Class12  Class13  Class14  Class15  Class16  
0       1        0        0        0        0        0        0        0  
1       0        0        0        0        0        0        0        0  
2       0        0        0        0        0        0        0        0  
3       0        0        0        0        0        0        1        0  
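
As a cross-check, a similar indicator table can be built with pd.crosstab. This is an alternative sketch rather than the method used in the answer above; the range 0 to 16 is taken from that answer, and the name `ind` is made up for the example.

```python
import pandas as pd

df = pd.DataFrame({
    "ID":    [0, 1, 1, 2, 2, 3, 3, 3],
    "Class": [9, 8, 6, 6, 2, 15, 1, 8],
})

ind = (
    pd.crosstab(df["ID"], df["Class"])          # counts per (ID, Class) pair
      .clip(upper=1)                            # turn counts into 0/1 indicators
      .reindex(columns=range(17), fill_value=0) # add the classes that never occur
      .add_prefix("Class")
      .reset_index()
)
print(ind)
```

The clip step only matters when the same class appears more than once for an ID; with the sample data every count is already 0 or 1.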



Answer 2:


Alternatively, collect the values of each ID into a list and expand the lists into columns with apply(pd.Series):

(df.groupby('ID').Class.apply(list)
   .apply(pd.Series)
   .add_prefix('Class_')
   .fillna(' '))
Out[602]: 
    Class_0 Class_1 Class_2
ID                         
0       9.0                
1       8.0       6        
2       6.0       2        
3      15.0       1       8


Source: https://stackoverflow.com/questions/45918559/python-merging-rows-with-same-value-in-one-column
