Building multi-regression model throws error: `Pandas data cast to numpy dtype of object. Check input data with np.asarray(data).`

Submitted by 十年热恋 on 2019-11-30 16:34:14

Question


I have a pandas DataFrame with some categorical predictors (i.e. variables) coded as 0 and 1, and some numeric variables. When I fit it with a statsmodels model like:

est = sm.OLS(y, X).fit()

It throws:

Pandas data cast to numpy dtype of object. Check input data with np.asarray(data). 

I converted all the dtypes of the DataFrame using df.convert_objects(convert_numeric=True).

After this, all dtypes of the DataFrame variables appear as int32 or int64. But the last line of the output still shows dtype: object, like this:

4516        int32
4523        int32
4525        int32
4531        int32
4533        int32
4542        int32
4562        int32
sex         int64
race        int64
dispstd     int64
age_days    int64
dtype: object

Here 4516, 4523, etc. are variable labels.

Any ideas? I need to build a multiple regression model on several hundred variables. To do that, I concatenated 3 pandas DataFrames into the final DataFrame used for model building.


Answer 1:


If X is your DataFrame, try using the .astype method to convert it to float when fitting the model:

est = sm.OLS(y, X.astype(float)).fit()
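
To see why the cast can help, it may be useful to run the check that the error message itself suggests. The sketch below uses made-up data (the values and the object-typed column are assumptions, not the asker's actual frame) and only needs pandas and NumPy. It shows that a single object column makes np.asarray(X) come out as object, and that .astype(float) yields a float64 array. Note also that the trailing "dtype: object" in a df.dtypes printout describes the Series holding the dtype listing, not the columns themselves.

```python
import numpy as np
import pandas as pd

# Hypothetical data: column "4516" holds numbers stored as Python objects,
# which can happen after concatenating several DataFrames.
X = pd.DataFrame({
    "4516": pd.Series([0, 1, 1, 0], dtype=object),
    "sex": [1, 0, 1, 0],
    "age_days": [100, 250, 310, 420],
})

# One object column is enough to make the whole array object-typed.
before = np.asarray(X).dtype

# Cast every column to float64, as suggested in the answer.
X_float = X.astype(float)
after = np.asarray(X_float).dtype

# Columns still stored as object, i.e. the ones worth inspecting.
object_cols = X.columns[X.dtypes == object].tolist()

print(before, after, object_cols)
```

If object_cols is empty for X, the same check on y may be worth running, since the response is converted to an array as well.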



Answer 2:


If both y (the dependent variable) and X are taken from a DataFrame, then cast both:

est = sm.OLS(y.astype(float), X.astype(float)).fit()
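
One caveat: .astype(float) raises a ValueError if a column contains values that cannot be parsed as numbers, and with hundreds of columns it can be hard to tell which one is at fault. A minimal sketch of one way to handle this is shown below. The helper name cast_for_ols and the sample data are hypothetical, and the final sm.OLS call is left as a comment so the snippet runs with pandas alone.

```python
import pandas as pd

def cast_for_ols(y, X):
    """Hypothetical helper: coerce y and X to float, failing loudly on bad values."""
    # errors="coerce" turns unparseable cells into NaN instead of raising.
    X_num = X.apply(pd.to_numeric, errors="coerce")
    y_num = pd.to_numeric(y, errors="coerce")

    # Cells that were present originally but could not be parsed as numbers.
    bad_cols = X.columns[(X_num.isna() & X.notna()).any()].tolist()
    y_bad = bool((y_num.isna() & y.notna()).any())
    if bad_cols or y_bad:
        raise ValueError(f"non-numeric values in: {bad_cols or 'y'}")

    return y_num.astype(float), X_num.astype(float)

# Made-up example data with object-typed inputs.
y = pd.Series(["1.5", "2.0", "3.5"], dtype=object)
X = pd.DataFrame({
    "sex": pd.Series([0, 1, 1], dtype=object),
    "age_days": [100, 250, 310],
})

y_f, X_f = cast_for_ols(y, X)
print(y_f.dtype, X_f.dtypes.tolist())

# With statsmodels imported as sm, the model would then be fitted as:
# est = sm.OLS(y_f, X_f).fit()
```

Raising on unparseable cells, rather than silently leaving NaN in place, is a design choice here: it names the offending columns before the model is fitted.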


Source: https://stackoverflow.com/questions/33833832/building-multi-regression-model-throws-error-pandas-data-cast-to-numpy-dtype-o
