Difference in Python statsmodels OLS and R's lm

遥遥无期 2020-12-25 12:59

I'm not sure why I'm getting slightly different results for a simple OLS, depending on whether I go through pandas' experimental rpy interface to do the regression in

2 answers
  • 2020-12-25 13:27

    It looks like statsmodels does not add an intercept by default when you pass the data to OLS directly, whereas R's lm does when you use the formula interface.

    This means you fitted two different models. Try

    lm( y ~ x - 1, data)
    

    in R to exclude the intercept, or, in your case and with somewhat more standard notation:

    lm(num_rx ~ ridageyr - 1, data=demoq)
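
    To see why the two fits disagree, here is a minimal sketch in plain Python using the closed-form formulas for a single predictor. The toy data are hypothetical (not the asker's demoq data) and were chosen so that y = 2 + 3x exactly.

```python
# Hypothetical toy data, not from the question: y = 2 + 3*x exactly.
x = [1.0, 2.0, 3.0, 4.0]
y = [5.0, 8.0, 11.0, 14.0]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Model with an intercept (what R's lm(y ~ x) fits):
# the slope comes from the centered cross-products.
sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
sxx = sum((xi - mean_x) ** 2 for xi in x)
slope_with_intercept = sxy / sxx
intercept = mean_y - slope_with_intercept * mean_x

# Model through the origin (what lm(y ~ x - 1) fits):
# the slope comes from the raw, uncentered cross-products.
slope_no_intercept = sum(xi * yi for xi, yi in zip(x, y)) / sum(xi ** 2 for xi in x)

print(intercept, slope_with_intercept)  # intercept and slope of the first model
print(slope_no_intercept)               # slope of the second model
```

    With an intercept the fit recovers a slope of 3; forcing the line through the origin gives a slope of 110/30, about 3.67, because the slope has to absorb the missing constant.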
    
  • 2020-12-25 13:40

    Note that you can still use ols from statsmodels.formula.api:

    from statsmodels.formula.api import ols
    
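    # The formula interface adds an intercept, as R's lm does
    results = ols('num_rx ~ ridageyr', data=demoq).fit()
    # print() is needed to see the table outside an interactive session
    print(results.summary())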
    

    I think it uses patsy in the backend to translate the formula expression, and the intercept is added automatically, just as in R.
