What to use to do multiple correlation?

て烟熏妆下的殇ゞ 提交于 2019-12-20 23:57:28

问题


I am trying to use python to compute multiple linear regression and multiple correlation between a response array and a set of arrays of predictors. I saw the very simple example to compute multiple linear regression, which is easy. But how to compute multiple correlation with statsmodels? or with anything else, as an alternative. I guess i could use rpy and R, but i'd prefer to stay in python if possible.

edit [clarification]: Considering a situation like the one described here: http://sphweb.bumc.bu.edu/otlt/MPH-Modules/BS/BS704-EP713_MultivariableMethods/ I would like to compute also multiple correlation coefficients for the predictors, in addition to the regression coefficients and the other regression parameters


回答1:


You could certainly do this with statsmodels and pandas. Something like this might get you started

import pandas
import statsmodels.api as sm
from statsmodels.formula.api import ols

data = pandas.DataFrame([["A", 4, 0, 1, 27], 
                         ["B", 7, 1, 1, 29], 
                         ["C", 6, 1, 0, 23], 
                         ["D", 2, 0, 0, 20], 
                         ["etc.", 3, 0, 1, 21]], 
                         columns=["ID", "score", "male", "age20", "BMI"])
print data.corr()

model = ols("BMI ~ score + male + age20", data=data).fit()
print model.params
print model.summary()

Have a look at the documentation:

http://statsmodels.sourceforge.net/devel/

http://pandas.pydata.org/

Edit: I'm not familiar with the terminology multiple correlation coefficient, but I believe this is just square root of the R-squared of a multiple regression model no?

print model.rsquared**.5
print model.rsquared_adj**.5

Is this what you're after?



来源:https://stackoverflow.com/questions/13452353/what-to-use-to-do-multiple-correlation

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!