statsmodels | 易学教程

ImportError: DLL load failed: when importing statsmodels [duplicate]

This question already has an answer here: "Installing scipy in Python 3.5 on 32-bit Windows 7 Machine" (4 answers).

My Python version is 3.5 on win32. I successfully installed Numpy+MKL, SciPy, and statsmodels from http://www.lfd.uci.edu/~gohlke/pythonlibs/. However, when I run

    import statsmodels as sm

I get the following error:

    Traceback (most recent call last):
      File "D:\Python\Innovation\try\Try_Reg.py", line 6, in <module>
        import statsmodels as sm
      File "C:\Python35\lib\site-packages\statsmodels\__init__.py", line 8, in <module>
        from .tools.sm_exceptions import (ConvergenceWarning,

Python Negative Binomial Regression - Results Don't Match those from R

Question: I'm experimenting with negative binomial regression using Python. I found this example using R, along with a data set: http://www.karlin.mff.cuni.cz/~pesta/NMFM404/NB.html

I tried to replicate the results on the web page using this code:

    import pandas as pd
    import statsmodels.formula.api as smf
    import statsmodels.api as sm

    df = pd.read_stata("http://www.karlin.mff.cuni.cz/~pesta/prednasky/NMFM404/Data/nb_data.dta")
    model = smf.glm(formula="daysabs ~ math + prog", data=df, family=sm.families

Python: Weighted coefficient of variation

How can I calculate the weighted coefficient of variation (CV) over a NumPy array in Python? It's okay to use any popular third-party Python package for this purpose. I can calculate the CV using scipy.stats.variation, but it's not weighted.

    import numpy as np
    from scipy.stats import variation

    arr = np.arange(-5, 5)
    weights = np.arange(9, -1, -1)  # Same size as arr
    cv = abs(variation(arr))  # Isn't weighted

This can be done using the statsmodels.stats.weightstats.DescrStatsW class in the statsmodels package for calculating weighted statistics.

    from statsmodels.stats.weightstats import
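The excerpt stops before the DescrStatsW call. As a rough sketch of the quantity being asked for, the weighted CV can also be computed with NumPy alone: take the weighted mean, the weighted variance around that mean, and divide the weighted standard deviation by the absolute mean. This sketch assumes a population-style variance (no degrees-of-freedom correction), which should correspond to DescrStatsW's default `ddof=0`; that correspondence is an assumption here, not something the excerpt confirms.

```python
import numpy as np

arr = np.arange(-5, 5)
weights = np.arange(9, -1, -1)  # same size as arr

# Weighted mean of the data
w_mean = np.average(arr, weights=weights)

# Weighted variance around the weighted mean
# (assumption: no degrees-of-freedom correction)
w_var = np.average((arr - w_mean) ** 2, weights=weights)

# Coefficient of variation: standard deviation relative to |mean|
weighted_cv = np.sqrt(w_var) / abs(w_mean)
print(weighted_cv)
```

Taking the absolute value of the mean mirrors the `abs(variation(arr))` call in the question, so the result stays non-negative when the weighted mean is negative.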

How to apply OLS from statsmodels to groupby

I am running OLS on products by month. While this works fine for a single product, my dataframe contains many products. If I create a groupby object, OLS gives an error.

linear_regression_df:

       product_desc  period_num  TOTALS
    0  product_a              1      53
    3  product_a              2      52
    6  product_a              3      50
    1  product_b              1      44
    4  product_b              2      43
    7  product_b              3      41
    2  product_c              1      36
    5  product_c              2      35
    8  product_c              3      34

    from pandas import DataFrame, Series
    import statsmodels.api as sm

    linear_regression_grouped = linear_regression_df.groupby(['product_desc'])
    X = linear_regression_grouped['period_num']
    y = linear_regression_grouped['TOTALS']

Statsmodels Categorical Data from Formula (using pandas)

Question: I am trying to finish up a homework assignment, and to do so I need to use categorical variables in statsmodels (due to a refusal to conform to using Stata like everyone else). I have spent some time reading through the documentation for both Patsy and statsmodels, and I can't quite figure out why this snippet of code isn't working. I have tried breaking it down and creating it with the patsy commands, but I come up with the same error. I currently have:

    import numpy as np
    import pandas as pd

Statsmodels: a pandas-based statistical modeling library

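Statsmodels is a Python toolkit for statistical modeling and econometrics, covering descriptive statistics as well as the estimation of and inference for statistical models.

Homepage: http://www.statsmodels.org/stable/index.html
Source code: https://github.com/statsmodels/statsmodels
Python package index: https://pypi.python.org/pypi/statsmodels/

This is the first article in a series on Statsmodels. It introduces what Statsmodels can do, so that beginners can decide whether the module is worth learning. I will follow up with a series of introductory tutorials, partly as notes for my own reference and partly as a quick-start guide for learners. Here are the main features of Statsmodels:

- Linear regression models
- Generalized linear models: mainly used for analysis of variance across various designs
- Robust linear models
- Discrete choice models: the logit model is one example; mainly used in microeconometrics
- ANOVA: analysis of variance models
- Time series analysis
- Nonparametric estimators: nonparametric methods
- a wide range of statistical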

Python Statsmodels Mixedlm (Mixed Linear Model) random effects

I am a bit confused about the output of Statsmodels Mixedlm and am hoping someone could explain. I have a large dataset of single family homes, including the previous two sale prices/sale dates for each property. I have geocoded this entire dataset and fetched the elevation for each property. I am trying to understand the way in which the relationship between elevation and property price appreciation varies between different cities. I have used statsmodels mixed linear model to regress price appreciation on elevation, holding a number of other factors constant, with cities as my groups

Performing analysis of covariance with python/scipy/statsmodel

Question: Could anyone please help by providing an example showing how ANCOVA (analysis of covariance) can be done in scipy/statsmodels with Python? I am not sure if I am asking too much, but a quick search showed me this, which is not informative enough for me. Thanks!

Answer 1: Statsmodels uses the linear model, OLS, to estimate ANOVA. So having additional continuous regressors, as in ANCOVA, does not change the analysis. Here are a few links to the relevant documentation: Anova helper functions and examples

Why do R and statsmodels give slightly different ANOVA results?

Using a small R sample dataset and the ANOVA example from statsmodels, the degrees of freedom for one of the variables are reported differently, and the F-value results are also slightly different. Perhaps they have slightly different default approaches? Can I set up statsmodels to use R's defaults?

    import pandas as pd
    import statsmodels.api as sm
    from statsmodels.formula.api import ols

    ##R code on R sample dataset
    #> anova(with(ChickWeight, lm(weight ~ Time + Diet)))
    #Analysis of Variance Table
    #
    #Response: weight
    #          Df  Sum Sq Mean Sq  F value    Pr(>F)
    #Time       1 2042344 2042344 1576.460 < 2.2e-16 *