statsmodels | 易学教程

VAR model with pandas + statsmodels in Python

Question: I am an avid user of R, but recently switched to Python for a few different reasons. However, I am struggling a little to run the vector AR model in Python from statsmodels. Q#1. I get an error when I run this, and I suspect it has something to do with the type of my vector. import numpy as np import statsmodels.tsa.api from statsmodels import datasets import datetime as dt import pandas as pd from pandas import Series from pandas import DataFrame import os df = pd.read_csv('myfile

Python-Statsmodels ARIMA Out-Of-Sample forecast raises ValueError: no rule for interpreting end

Question: I am trying to use the statsmodels Python library, and in particular the ARIMA model. I am having some trouble performing an out-of-sample forecast with it. Here is my call: predicted = my_arima_result.predict(start=startDateTime, end=endDateTime) From my understanding, I need to pass start and end parameters; but when passing an end parameter I get the following error: ValueError: no rule for interpreting end I did some tests and the call only works for in-sample calls, passing at most the start

When installing statsmodels, I get the following error: RuntimeError: dictionary changed size during iteration

Question: I have read a lot of posts about this error. I am posting because I get the error when trying to install the statsmodels package, not when running one of my own programs. How do I correct the error when installing a package? $ sudo pip3 install statsmodels Downloading/unpacking statsmodels Downloading statsmodels-0.5.0.tar.gz (5.5MB): 5.5MB downloaded Running setup.py (path:/tmp/pip_build_root/statsmodels/setup.py) egg_info for package statsmodels Traceback (most recent call last): File

OLS of statsmodels does not work with inversely proportional data?

Question: I'm trying to perform an Ordinary Least Squares regression with some inversely proportional data, but it seems like the fitting result is wrong? import statsmodels.formula.api as sm import numpy as np import matplotlib.pyplot as plt y = np.arange(100, 0, -1) x = np.arange(0, 100) result = sm.OLS(y, x).fit() fig, ax = plt.subplots(nrows=1, ncols=1, figsize=(20, 4), sharey=True) ax.plot(x, result.fittedvalues, 'r-') ax.plot(x, y, 'x') fig.show() Answer 1: You're not adding a constant as the

Specifying a Constant in Statsmodels Linear Regression?

Question: I want to use the statsmodels.regression.linear_model.OLS package to do a prediction, but with a specified constant. Currently, I can specify the presence of a constant with an argument (from the docs: http://statsmodels.sourceforge.net/devel/generated/statsmodels.regression.linear_model.OLS.html): class statsmodels.regression.linear_model.OLS(endog, exog=None, missing='none', hasconst=None), where hasconst is a boolean. What I want to do is specify explicitly a constant C, and then fit a linear

Why does my Pandas join shift rows of the joined data?

Question: In Pandas, when I join, the joined data is misaligned with respect to the original DataFrame: import os import pandas as pd import statsmodels.formula.api as sm import numpy as np import matplotlib.pyplot as plt flu_train = pd.read_csv('FluTrain.csv') # From: https://courses.edx.org/c4x/MITx/15.071x/asset/FluTrain.csv cols = ['Ystart', 'Mstart', 'Dstart', 'Yend', 'Mend', 'Dend'] flu_train = flu_train.join(pd.DataFrame(flu_train.Week.str.findall(r'\d+').tolist(), dtype=np.int64, columns=cols))
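The excerpt does not show why the rows shift, so the following is a guess at one common cause rather than a diagnosis: DataFrame.join aligns on the index, and a DataFrame built from a plain list gets a fresh 0..n-1 index. If the left frame's index is anything else (for example after filtering), the rows will not line up. The two-row frame and its Week strings are hypothetical stand-ins for FluTrain.csv.

```python
import pandas as pd

cols = ["Ystart", "Mstart", "Dstart", "Yend", "Mend", "Dend"]

# Hypothetical stand-in for flu_train, with a non-default index.
left = pd.DataFrame(
    {"Week": ["2004-01-04 - 2004-01-10", "2004-01-11 - 2004-01-17"]},
    index=[10, 11],
)
digits = left["Week"].str.findall(r"\d+").tolist()

# Default index (0, 1) does not match left's index (10, 11): the join yields NaN.
misaligned = left.join(pd.DataFrame(digits, columns=cols))

# Reusing left's index keeps each parsed row next to the row it came from.
parts = pd.DataFrame(digits, columns=cols, index=left.index).astype("int64")
joined = left.join(parts)
print(joined)
```

Calling reset_index(drop=True) on the left frame before the join is another way to make the two indexes agree.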

python OLS statsmodels T Stats of variables not entered into the model

Question: I have created an OLS regression using statsmodels. I've written some code that loops through every variable in a dataframe, enters it into the model, records the t-statistic in a new dataframe, and builds a list of potential variables. However, I have 20,000 variables, so it takes ages to run each time. Can anyone think of a better approach? This is my current approach: TStatsOut=pd.DataFrame() for i in VarsOut: try: xstrout='+'.join([baseterms,i]) fout='ymod~'+xstrout modout = smf.ols(fout

Best model for variable selection with big data?

Question: I posted a question earlier about some code, but now I realize I should ask more broadly about the general idea. Basically, I'm trying to build a statistical model with about 1000 observations and 2000 variables. I would like to determine which variables are most influential in affecting my dependent variable with high significance. I don't plan to use the model for prediction, just for variable selection. My independent variables are binary and the dependent variable is continuous. I've tried

Weekday as dummy / factor variable in a linear regression model using statsmodels

Question: How can I add a dummy / factor variable to a model using sm.OLS()? The details: below is a reproducible dataframe that you can copy with Ctrl+C and then use with the snippet further down for a reproducible example. Input data:

Date        A      B      weekday
2013-05-04  25.03  88.51  Saturday
2013-05-05  52.98  67.99  Sunday
2013-05-06  39.93  75.19  Monday
2013-05-07  47.31  86.99  Tuesday
2013-05-08  19.61  87.94  Wednesday
2013-05-09  39.51  83.10  Thursday
2013-05-10  21.22  62.16  Friday
2013-05-11  19.04  58

linear regression model with AR errors python

Question: Is there a Python package (statsmodels/scipy/pandas/etc.) with functionality for estimating the coefficients of a linear regression model with autoregressive errors, such as the SAS implementation below? http://support.sas.com/documentation/cdl/en/etsug/63348/HTML/default/viewer.htm#etsug_autoreg_sect003.htm Answer 1: statsmodels (http://www.statsmodels.org/dev/index.html) has ARMA, ARIMA and SARIMAX models that take explanatory variables to model the mean. This corresponds to a