Modify Value of Pandas dataframe Groups

最后都变了 - submitted on 2019-12-24 09:39:11

Question


We have the following dataframe (df) with 3 columns. The goal is to make the sum of "Load" within each ID group equal to 1.

import pandas as pd

df = pd.DataFrame({'Num':[1,2,3,4,5],'ID':['AEC','AEC','CIZ','CIZ','CIZ'],'Load':[0.2093275,0.5384086,0.1465657,0.7465657,0.1465657]})

Num   ID  Load
1   AEC 0.2093275
2   AEC 0.5384086
3   CIZ 0.1465657
4   CIZ 0.7465657
5   CIZ 0.1465657

If a group's total load is below or above 1, we want to add to or subtract from exactly one member of the group so that the total becomes 1, without adding rows to the dataframe (only by modifying existing values). How can we do that?

Thank you all in advance.


Answer 1:


I use sample to randomly pick one row from each group and apply the correction to that row. Because the pick is random, the adjusted row can differ from run to run.

df['New']=(1-df.groupby('ID').Load.transform('sum'))

df['Load']=df.Load.add(df.groupby('ID').New.apply(lambda x : x.sample(1)).reset_index('ID',drop=True)).fillna(df.Load)

df.drop(columns='New')
Out[163]: 
   Num   ID      Load
0    1  AEC  0.209327
1    2  AEC  0.790673
2    3  CIZ  0.146566
3    4  CIZ  0.746566
4    5  CIZ  0.106869

Check

df.drop(columns='New').groupby('ID').Load.sum()
Out[164]: 
ID
AEC    1.0
CIZ    1.0
Name: Load, dtype: float64
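
The same idea can be sketched without the temporary New column. This is only an illustration, not the answerer's code: it assumes pandas 1.1 or later (for the groupby sample method), the seed random_state=0 is an arbitrary choice to make the pick repeatable, and the names gap, picked and balanced are made up for the example.

```python
import pandas as pd

df = pd.DataFrame({
    'Num': [1, 2, 3, 4, 5],
    'ID': ['AEC', 'AEC', 'CIZ', 'CIZ', 'CIZ'],
    'Load': [0.2093275, 0.5384086, 0.1465657, 0.7465657, 0.1465657],
})

# Amount each row's group is short of (positive) or over (negative) a total of 1.
gap = 1 - df.groupby('ID')['Load'].transform('sum')

# Index labels of one randomly chosen row per group.
# random_state=0 is an arbitrary seed (assumption) so the pick is repeatable.
picked = df.groupby('ID').sample(n=1, random_state=0).index

# Apply the correction only to the picked rows; all other rows stay as they are.
balanced = df.copy()
balanced.loc[picked, 'Load'] += gap.loc[picked]

print(balanced)
print(balanced.groupby('ID')['Load'].sum())
```

Working on a copy keeps the original df intact, which makes it easy to compare the two frames afterwards.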



Answer 2:


You can use drop_duplicates to locate the first row of each group and then adjust that row's Load so that the group's total Load is 1.

df.loc[df.ID.drop_duplicates().index, 'Load'] -= df.groupby('ID').Load.sum().subtract(1).values

df
Out[92]: 
   Num   ID      Load
0    1  AEC  0.461591
1    2  AEC  0.538409
2    3  CIZ  0.106869
3    4  CIZ  0.746566
4    5  CIZ  0.146566

df.groupby('ID').Load.sum()
Out[93]: 
ID
AEC    1.0
CIZ    1.0
Name: Load, dtype: float64
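
Because the one-liner above passes .values, the adjustment is matched to rows by position rather than by index label, so it relies on the groups' first rows appearing in the same order as the group sums. A label-aligned variant could look like the sketch below. It is an illustration under assumptions, not the answerer's code: the IDs are shuffled on purpose, and the names first, excess and fixed are made up for the example.

```python
import pandas as pd

# Same loads as in the question, but with the IDs deliberately out of sorted order.
df = pd.DataFrame({
    'ID': ['CIZ', 'AEC', 'CIZ', 'AEC', 'CIZ'],
    'Load': [0.1465657, 0.2093275, 0.7465657, 0.5384086, 0.1465657],
})

# True for the first row of each ID, False for later rows of the same ID.
first = ~df['ID'].duplicated()

# Per-row excess of the row's group total over 1 (negative if the group is short).
excess = df.groupby('ID')['Load'].transform('sum') - 1

# Subtract the excess from the first row of each group only.
# The assignment aligns on index labels, so row order does not matter.
fixed = df.copy()
fixed.loc[first, 'Load'] -= excess[first]

print(fixed)
print(fixed.groupby('ID')['Load'].sum())
```

Using transform('sum') gives every row its own group's total, so no positional pairing between rows and group sums is needed.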


Source: https://stackoverflow.com/questions/48533538/modify-value-of-pandas-dataframe-groups
