Python 2.7 - statsmodels - formatting and writing summary output

混江龙づ霸主 提交于 2019-12-03 02:54:10

There is no premade table of parameters and their result statistics currently available.

Essentially you need to stack all the results yourself, whether in a list, numpy array or pandas DataFrame depends on what's more convenient for you.

for example, if I want one numpy array that has the results for a model, llf and results in the summary parameter table, then I could use

res_all = []
for res in results:
    low, upp = res.confint().T   # unpack columns 
    res_all.append(numpy.concatenate(([res.llf], res.params, res.tvalues, res.pvalues, 
                   low, upp)))

But it might be better to align with pandas, depending on what structure you have across models.

You could write a helper function that takes all the results from the results instance and concatenates them in a row.

(I'm not sure what's the most convenient for writing to csv by rows)

edit:

Here is an example storing the regression results in a dataframe

https://github.com/statsmodels/statsmodels/blob/master/statsmodels/sandbox/multilinear.py#L21

the loop is on line 159.

summary() and similar code outside of statsmodels, for example http://johnbeieler.org/py_apsrtable/ for combining several results, is oriented towards printing and not to store variables.

If you want to find coefficient results.params will give you coefficients. If you want to find pvalues then use results.pvalues. In any way you can use dir(results) to find out all the attribute of a object.

I found this formulation to be a little more straightforward. You can add/subtract columns by following the syntax from the examples (pvals,coeff,conf_lower,conf_higher).

import pandas as pd     #This can be left out if already present...

def results_summary_to_dataframe(results):
    '''This takes the result of an statsmodel results table and transforms it into a dataframe'''
    pvals = results.pvalues
    coeff = results.params
    conf_lower = results.conf_int()[0]
    conf_higher = results.conf_int()[1]

    results_df = pd.DataFrame({"pvals":pvals,
                               "coeff":coeff,
                               "conf_lower":conf_lower,
                               "conf_higher":conf_higher
                                })

    #Reordering...
    results_df = results_df[["coeff","pvals","conf_lower","conf_higher"]]
    return results_df
write_path = '/my/path/here/output.csv'
with open(write_path, 'w') as f:
    f.write(result.summary().as_csv())

There is actually a built-in method documented in the documentation here:

f = open('csvfile.csv','w')
f.write(result.summary().as_csv())
f.close()

I believe this is a much easier (and clean) way to output the summaries to csv files.

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!