Statsmodels gives different ANOVA results to SPSS

两盒软妹~` 提交于 2019-12-08 06:54:30

问题


I'm getting acquainted with Statsmodels so as to shift my more complicated stats completely over to python. However, I'm being cautious, so I'm cross-checking my results with SPSS, just to make sure I'm not making any obvious blunders. Most of time, there's no difference, but I have one example of a two-way ANOVA that's throwing up very different test statistics in Statsmodels and SPSS. (Relevant point: the sample sizes in the ANOVA are mismatched, so ANOVA may not be the appropriate model here.)

I'm selecting my model as follows:

import pandas as pd
import scipy as sp
import numpy as np
import statsmodels.api as sm
import seaborn as sns
import statsmodels
import statsmodels.api as sm
from statsmodels.formula.api import ols
import matplotlib.pyplot as plt

Body = pd.read_csv(filepath)

Body = Body.dropna()

Body_lm = ols('Effect ~ C(Fiction) + C(Condition) + C(Fiction)*C(Condition)', data = Body).fit()

table = sm.stats.anova_lm(Body_lm, typ=2)

The Statsmodels output is as below:

                            sum_sq     df           F        PR(>F)
C(Fiction)               278.176684    1.0  307.624463  1.682042e-55
C(Condition)               4.294764    1.0    4.749408  2.971278e-02
C(Fiction):C(Condition)   10.776312    1.0   11.917092  5.970123e-04
Residual                 520.861599  576.0         NaN           NaN

The corresponding SPSS results are these:

Can anyone help explain the difference? Is is perhaps the unequal sample sizes being treated differently under the hood? Or am I choosing the wrong model?

Any help appreciated!


回答1:


You should use sum coding when comparing the means of the variables. BTW you don't need to specify each variable that are in the interaction term if * multiply operator is used:

“:” adds a new column to the design matrix with the product of the other two columns.
“*” will also include the individual columns that were multiplied together.

Your model should be:

Body_lm = ols('Effect ~ C(Fiction, Sum)*C(Condition, Sum)', data = Body).fit()


来源:https://stackoverflow.com/questions/42160020/statsmodels-gives-different-anova-results-to-spss

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!