How to apply OLS from statsmodels to groupby

情到浓时终转凉″ submitted on 2019-12-01 12:08:39

You could do something like this ...

import pandas as pd
import statsmodels.api as sm

for products in linear_regression_df.product_desc.unique():
    tempdf = linear_regression_df[linear_regression_df.product_desc == products]
    X = tempdf['period_num']
    y = tempdf['TOTALS']

    model = sm.OLS(y, X)
    results = model.fit()

    print(results.params)  # or whatever summary info you want

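Use get_group to pull out each individual group and fit an OLS model to it (linear_regression_grouped is assumed here to be the GroupBy object, e.g. linear_regression_df.groupby('product_desc')):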

for group in linear_regression_grouped.groups.keys():
    df = linear_regression_grouped.get_group(group)
    X = df['period_num']
    y = df['TOTALS']
    model = sm.OLS(y, X)
    results = model.fit()
    print(results.summary())

In practice, you usually also want an intercept term, so the model should be defined slightly differently, with a constant column added to the regressors:

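for group in linear_regression_grouped.groups.keys():
    # Copy the group so that adding a column does not touch the original frame
    df = linear_regression_grouped.get_group(group).copy()
    df['constant'] = 1  # column of ones acts as the intercept term
    X = df[['period_num', 'constant']]
    y = df['TOTALS']
    model = sm.OLS(y, X)
    results = model.fit()
    print(results.summary())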

The results with and without the intercept can be very different, because a model without a constant forces the fitted line through the origin.
