Pandas groupby and correct with median in new column

Submitted by 扶醉桌前 on 2021-01-29 03:09:48

Question


My dataframe looks like this:

Plate Sample LogRatio
 P1     S1     0.42
 P1     S2     0.23 
 P2     S3     0.41 
 P3     S4     0.36 
 P3     S5     0.18

I have calculated the median of each plate (though this is probably not the best way to start):

grouped = df.groupby("Plate")
medianesPlate = grouped["LogRatio"].median() 

And I want to add a column to my dataframe:

CorrectedLogRatio = LogRatio-median(plate)

I suppose with something like:

df["CorrectedLogRatio"] = LogRatio-median(plate)

to get something like this:

Plate Sample LogRatio CorrectedLogRatio
 P1     S1     0.42    0.42-median(P1)   
 P1     S2     0.23    0.23-median(P1)
 P2     S3     0.41    0.41-median(P2)
 P3     S4     0.36    0.36-median(P3)
 P3     S5     0.18    0.18-median(P3)

But I don't know how to look up each row's median from medianesPlate. I tried some apply and transform functions, but they didn't work. Thanks for any help.


Answer 1:


You can use transform, which broadcasts each group's median back onto the original rows so it can be subtracted directly:

df['CorrectedLogRatio'] = df['LogRatio'] - df.groupby('Plate')['LogRatio'].transform('median')

The resulting output:

  Plate Sample  LogRatio  CorrectedLogRatio
0    P1     S1      0.42              0.095
1    P1     S2      0.23             -0.095
2    P2     S3      0.41              0.000
3    P3     S4      0.36              0.090
4    P3     S5      0.18             -0.090
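
For anyone who wants to reproduce this, here is a minimal self-contained sketch. It rebuilds the sample dataframe from the question, applies the transform approach, and also shows one possible alternative that reuses the per-plate medians the question already computed (medianesPlate) via Series.map. The names plate_median and corrected_via_map are only illustrative and do not appear in the original post.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "Plate": ["P1", "P1", "P2", "P3", "P3"],
    "Sample": ["S1", "S2", "S3", "S4", "S5"],
    "LogRatio": [0.42, 0.23, 0.41, 0.36, 0.18],
})

# transform("median") returns a Series with the same index as df,
# holding the median of each row's plate
plate_median = df.groupby("Plate")["LogRatio"].transform("median")
df["CorrectedLogRatio"] = df["LogRatio"] - plate_median

# Alternative: reuse the one-value-per-plate Series from the question
# and look up each row's median by its Plate label
medianesPlate = df.groupby("Plate")["LogRatio"].median()
corrected_via_map = df["LogRatio"] - df["Plate"].map(medianesPlate)

print(df.round(3))
```

Both routes should give the same column. transform is usually the shorter option when the medians are not needed for anything else, while map is handy if the per-plate Series already exists.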


Source: https://stackoverflow.com/questions/40532303/pandas-groupby-and-correct-with-median-in-new-column
