Pandas Dataframe AttributeError: 'DataFrame' object has no attribute 'design_info'

Submitted by ╄→尐↘猪︶ㄣ on 2019-12-05 02:36:48

Pickling and unpickling of a pandas DataFrame doesn't save and restore attributes that have been attached by a user, as far as I know.
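As a quick check of that claim, here is a minimal sketch that attaches an attribute to a small DataFrame and round-trips it through pickle. The column values and the placeholder string are made up for illustration, and the behaviour may differ between pandas versions.

```python
import pickle

import pandas as pd

# Toy data; the column names B and C mirror the formula used in this answer.
df = pd.DataFrame({"B": [1.0, 2.0, 3.0], "C": [0.5, 0.1, 0.9]})

# Attach a user-defined attribute, standing in for the real design_info object.
df.design_info = "placeholder for formula metadata"

# Round-trip through pickle, as happens when a model or results object is saved.
restored = pickle.loads(pickle.dumps(df))

has_before = hasattr(df, "design_info")
has_after = hasattr(restored, "design_info")
print(has_before, has_after)
```

If the attribute is missing on the restored object, any code that later looks up design_info on it will raise the AttributeError from the title.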

Since the formula information is currently stored together with the DataFrame of the original design matrix, this information is lost after unpickling a Results and Model instance.

If you don't use categorical variables or transformations, the correct design matrix can be built with patsy.dmatrix. I think the following should work:

import patsy

x = patsy.dmatrix("B + C", data=df)  # df holds the new data for prediction
test2 = model.predict(x, transform=False)  # transform=False: x is already a design matrix

Alternatively, constructing the design matrix for the prediction directly should also work. Note that we need to explicitly add the constant that the formula adds by default.

from statsmodels.api import add_constant
test2 = model.predict(add_constant(df[["B", "C"]]), transform=False)

If the formula and design matrix contain (stateful) transformations and categorical variables, then it's not possible to conveniently construct the design matrix without the original formula information. Constructing it by hand and doing all the calculations explicitly is difficult in this case, and loses all the advantages of using formulas.
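A small sketch of why the original formula information matters for stateful transformations, using patsy's center() as the example. The data and the variable names train and new are made up for illustration. Reusing the stored design_info applies the mean memorized from the original data, while calling dmatrix again on the new data recomputes the mean from the new data alone.

```python
import numpy as np
import pandas as pd
import patsy

train = pd.DataFrame({"B": [1.0, 2.0, 3.0]})  # mean of B is 2.0
new = pd.DataFrame({"B": [10.0]})             # data for prediction

# Build the original design matrix; its design_info remembers the training mean.
dm_train = patsy.dmatrix("center(B)", data=train)
design_info = dm_train.design_info

# Reusing design_info centers the new data with the training mean.
with_state = np.asarray(patsy.build_design_matrices([design_info], new)[0])

# Rebuilding from the formula string alone centers with the new data's own mean.
without_state = np.asarray(patsy.dmatrix("center(B)", data=new))

print(with_state, without_state)
```

The two results differ in the centered column, which is why a design matrix rebuilt without the stored state would give wrong predictions.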

The only real solution is to pickle the formula information (design_info) independently of the DataFrame (orig_exog).
