Using polyfit on pandas dataframe and then adding the results to new columns

本秂侑毒 submitted on 2019-12-10 11:54:29

Question


I have a dataframe like this. Each Id has two rows, giving (x1, x2) and (y1, y2). I want to pass these to polyfit(), get the slope and the x-intercept, and add them as new columns.

    Id        x         y
    1     0.79978   0.018255
    1     1.19983   0.020963
    2     2.39998   0.029006
    2     2.79995   0.033004
    3     1.79965   0.021489
    3     2.19969   0.024194
    4     1.19981   0.019338
    4     1.59981   0.022200
    5     1.79971   0.025629
    5     2.19974   0.028681

I really need help with grouping the correct rows and supplying them to polyfit. I have been struggling with this. Any help would be most welcome.


Answer 1:


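You can group by Id and apply the fit within each group. First, set Id as the index so that the per-group results align with the rows on assignment and you can avoid a merge later.

import pandas as pd
import numpy as np

# Move Id into the index so the per-group result aligns with the rows on assignment
df = df.set_index('Id')
# Fit a degree-1 polynomial (a straight line) to each Id's points
df['fit'] = df.groupby('Id').apply(lambda g: np.polyfit(g.x, g.y, 1))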

df is now:

          x         y                                           fit
Id                                                                 
1   0.79978  0.018255  [0.0067691538557680215, 0.01284116612923385]
1   1.19983  0.020963  [0.0067691538557680215, 0.01284116612923385]
2   2.39998  0.029006   [0.00999574968122608, 0.005016400680051043]
2   2.79995  0.033004   [0.00999574968122608, 0.005016400680051043]
3   1.79965  0.021489  [0.006761823817618233, 0.009320083766623343]
3   2.19969  0.024194  [0.006761823817618233, 0.009320083766623343]
...
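
For comparison, the merge-based route that setting the index avoids could look roughly like the sketch below. It rebuilds the question's data so it can be run on its own. The names `rows`, `coeffs`, and `merged` are made up for illustration, and the explicit loop is just one way to collect the per-group coefficients, not the answer's method.

```python
import numpy as np
import pandas as pd

# Data copied from the question
df = pd.DataFrame({
    "Id": [1, 1, 2, 2, 3, 3, 4, 4, 5, 5],
    "x": [0.79978, 1.19983, 2.39998, 2.79995, 1.79965,
          2.19969, 1.19981, 1.59981, 1.79971, 2.19974],
    "y": [0.018255, 0.020963, 0.029006, 0.033004, 0.021489,
          0.024194, 0.019338, 0.022200, 0.025629, 0.028681],
})

# One row of coefficients per Id (hypothetical helper table)
rows = []
for group_id, g in df.groupby("Id"):
    # Degree-1 fit: coefficients come back highest power first
    slope, intercept = np.polyfit(g["x"], g["y"], 1)
    rows.append({"Id": group_id, "slope": slope, "intercept": intercept})
coeffs = pd.DataFrame(rows)

# Broadcast the per-Id coefficients back onto every original row
merged = df.merge(coeffs, on="Id", how="left")
print(merged)
```

This keeps Id as an ordinary column throughout, at the cost of the extra merge step.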

Each fit holds the coefficients with the highest power first, so for degree 1 it is [slope, y-intercept]. If you want a separate column for each, you can apply pd.Series:

df[['slope', 'intercept']] = df.fit.apply(pd.Series)
df = df.drop(columns='fit').reset_index()

df is now:

   Id        x         y     slope  intercept
0   1  0.79978  0.018255  0.006769   0.012841
1   1  1.19983  0.020963  0.006769   0.012841
2   2  2.39998  0.029006  0.009996   0.005016
3   2  2.79995  0.033004  0.009996   0.005016
4   3  1.79965  0.021489  0.006762   0.009320
5   3  2.19969  0.024194  0.006762   0.009320
6   4  1.19981  0.019338  0.007155   0.010753
7   4  1.59981  0.022200  0.007155   0.010753
8   5  1.79971  0.025629  0.007629   0.011898
9   5  2.19974  0.028681  0.007629   0.011898
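
Note that the intercept column here is the y-intercept, while the question asks for the x-intercept. Assuming that means the x value where the fitted line crosses y = 0, it follows from y = slope * x + intercept as x = -intercept / slope. A minimal sketch, using the rounded values from the table above (one row per Id); the names `result` and `x_intercept` are made up for illustration:

```python
import pandas as pd

# Slope and y-intercept per Id, copied (rounded) from the result table
result = pd.DataFrame({
    "Id": [1, 2, 3, 4, 5],
    "slope": [0.006769, 0.009996, 0.006762, 0.007155, 0.007629],
    "intercept": [0.012841, 0.005016, 0.009320, 0.010753, 0.011898],
})

# Solve slope * x + intercept = 0 for x
result["x_intercept"] = -result["intercept"] / result["slope"]
print(result)
```

The same one-line division should work directly on the full dataframe once it has slope and intercept columns. A slope of zero would give an infinite or undefined value, so that case needs separate handling if it can occur in your data.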


Source: https://stackoverflow.com/questions/51140302/using-polyfit-on-pandas-dataframe-and-then-adding-the-results-to-new-columns
