statsmodels | 易学教程

Using statsmodels.seasonal_decompose() without DatetimeIndex but with Known Frequency

Question: I have a time-series signal I would like to decompose in Python, so I turned to statsmodels.seasonal_decompose(). My data has a frequency of 48 (half-hourly). I was getting the same error as this questioner, where the solution was to change from an integer index to a DatetimeIndex. But I don't know the actual dates/times my data is from. In this GitHub thread, one of the statsmodels contributors says that "In 0.8, you should be able to specify freq as keyword argument to override the index." But

statsmodels installation: No module named 'numpy.distutils._msvccompiler' in numpy.distutils; trying from distutils

Question: This seems to be a common problem, but none of the answers seem to help. The error pops up when installing statsmodels on Windows 10 (Python 3.6.2 installed) with: python setup.py install Before that, numpy was installed with: python numpy install There was no error, so I assume it succeeded. But the statsmodels installation still fails with the error above. I did install the MS C++ compiler (2015). I also installed the latest Anaconda (Python 3.6.1) and it did not help. The

Python - find where the plot crosses the axhline on python plot

Question: I am doing some analysis on some simple data, and I am trying to plot the autocorrelation and partial autocorrelation. Using these plots, I am trying to find the P and Q values for my ARIMA model. I can read them off the graphs, but I am wondering whether I can explicitly find, for each graph, where the plot crosses the axhline. plt.subplot(122) plt.plot(lag_pacf) plt.axhline(y=0, linestyle = '--', color = 'grey') plt.axhline(y=-1.96/np.sqrt(len(log_moving_average_difference)),linestyle = '--',color
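
One way to locate such crossings numerically is to look for sign changes of the plotted values minus the line's level and interpolate linearly between the two neighbouring points. The sketch below is only an illustration under that assumption; `find_crossings` is a hypothetical helper name, not a matplotlib or statsmodels function, and the sample values are made up.

```python
def find_crossings(values, level):
    """Return fractional indices where the piecewise-linear curve
    through `values` reaches or crosses the horizontal line `level`."""
    crossings = []
    for i in range(len(values) - 1):
        d0 = values[i] - level
        d1 = values[i + 1] - level
        if d0 == 0:
            # The point lies exactly on the line.
            crossings.append(float(i))
        elif d0 * d1 < 0:
            # Sign change: interpolate linearly between i and i + 1.
            crossings.append(i + d0 / (d0 - d1))
    if values and values[-1] == level:
        crossings.append(float(len(values) - 1))
    return crossings


# Made-up PACF-like values, checked against the zero line.
example = [1.0, 0.5, -0.5, -0.2, 0.1]
zero_crossings = find_crossings(example, 0.0)
print(zero_crossings)
```

The same helper could be called with the confidence-band levels from the question (for example `1.96 / sqrt(n)` and its negative) in place of `0.0`.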

ECDF in python without step function?

Question: I have been using ECDF (empirical cumulative distribution function) from statsmodels.distributions to plot a CDF of some data. However, ECDF uses a step function, and as a consequence I get jagged-looking plots. So my question is: do scipy or statsmodels have an ECDF built in without a step function? By the way, I know I can do this: hist, bin_edges = histogram(b_oz, normed=True) plot(np.cumsum(hist)) but I don't get the right scales. Thanks! Answer 1: If you just want to change the plot, then you
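
A non-stepped curve with the right scales can be sketched by sorting the data and pairing each sorted value with its cumulative fraction, then drawing straight segments between the points. This is a minimal illustration, assuming linear interpolation between sample points is acceptable; `ecdf_points` is a hypothetical helper, not part of scipy or statsmodels.

```python
def ecdf_points(data):
    """Return (xs, ys): the sorted sample and the fraction of
    observations less than or equal to each sorted value."""
    xs = sorted(data)
    n = len(xs)
    ys = [(i + 1) / n for i in range(n)]
    return xs, ys


xs, ys = ecdf_points([3.0, 1.0, 2.0, 5.0])
print(xs, ys)

# With matplotlib available, a plain line plot joins the points with
# straight segments instead of steps:
#   import matplotlib.pyplot as plt
#   plt.plot(xs, ys)
```

The x axis stays in data units and the y axis runs from 1/n to 1, which avoids the scaling problem of the cumulative-histogram approach mentioned in the question.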

Get Durbin-Watson and Jarque-Bera statistics from OLS Summary in Python

Question: I am running the OLS summary for a column of values. Part of the OLS summary is the Durbin-Watson and Jarque-Bera (JB) statistics, and I want to pull those values out directly, since they have already been calculated, rather than recomputing them in extra steps as I do now with durbinwatson. Here is the code I have: import pandas as pd import statsmodels.api as sm csv = 'mydata.csv' df = pd.read_csv(csv) var = df[variable] year = df['Year'] model = sm.OLS(var,year) results = model.fit() summary =
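
For reference, both statistics depend only on the residuals of the fit (available as `results.resid` on a fitted statsmodels OLS result), and `statsmodels.stats.stattools` provides `durbin_watson` and `jarque_bera` functions that take those residuals. The sketch below recomputes the two statistics from their textbook formulas in plain Python; it is an illustration of what the numbers mean, not the way the summary object stores them. The names `dw_statistic` and `jb_statistic` are hypothetical, and the residuals are made up.

```python
def dw_statistic(residuals):
    """Durbin-Watson: sum of squared successive differences of the
    residuals divided by the sum of squared residuals."""
    num = sum((residuals[t] - residuals[t - 1]) ** 2
              for t in range(1, len(residuals)))
    den = sum(e * e for e in residuals)
    return num / den


def jb_statistic(residuals):
    """Jarque-Bera: n/6 * (S^2 + (K - 3)^2 / 4), where S and K are the
    sample skewness and kurtosis computed from biased central moments."""
    n = len(residuals)
    mean = sum(residuals) / n
    m2 = sum((e - mean) ** 2 for e in residuals) / n
    m3 = sum((e - mean) ** 3 for e in residuals) / n
    m4 = sum((e - mean) ** 4 for e in residuals) / n
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2
    return n / 6.0 * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0)


resid = [1.0, -1.0, 1.0, -1.0]  # made-up residuals
print(dw_statistic(resid), jb_statistic(resid))
```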

How to invert differencing in a Python statsmodels ARIMA forecast?

Question: I'm trying to wrap my head around ARIMA forecasting using Python and statsmodels. Specifically, for the ARIMA algorithm to work, the data needs to be made stationary via differencing (or a similar method). The question is: how does one invert the differencing after the residual forecast has been made, to get back to a forecast that includes the trend and seasonality that were differenced out? (I saw a similar question here but alas, no answers have been posted.) Here's what I've done so far (based on
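
For plain first-order differencing, the inversion is a cumulative sum of the forecast differences added to the last observed level: the k-th forecast equals the last observation plus the sum of the first k forecast differences. The sketch below assumes exactly that case (one non-seasonal difference applied by hand before fitting); `invert_first_difference` is a hypothetical helper and the numbers are made up.

```python
from itertools import accumulate


def invert_first_difference(last_observed, diff_forecast):
    """Rebuild level forecasts from forecast first differences.

    last_observed: last value of the original (undifferenced) series.
    diff_forecast: forecasts of y[t] - y[t-1] for the following steps.
    """
    return [last_observed + c for c in accumulate(diff_forecast)]


levels = invert_first_difference(10.0, [1.0, 2.0, -0.5])
print(levels)
```

If the series was also log-transformed before differencing, the same cumulative sum would be applied on the log scale first and the result exponentiated afterwards. Seasonal or higher-order differencing needs the corresponding earlier observations rather than only the last one.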

Why am I getting “LinAlgError: Singular matrix” from grangercausalitytests?

Question: I am trying to run grangercausalitytests on two time series: import numpy as np import pandas as pd from statsmodels.tsa.stattools import grangercausalitytests n = 1000 ls = np.linspace(0, 2*np.pi, n) df1 = pd.DataFrame(np.sin(ls)) df2 = pd.DataFrame(2*np.sin(1+ls)) df = pd.concat([df1, df2], axis=1) df.plot() grangercausalitytests(df, maxlag=20) However, I am getting: Granger Causality number of lags (no zero) 1 ssr based F test: F=272078066917221398041264652288.0000, p=0.0000 , df_denom=996,

Python ARIMA exogenous variable out of sample

Question: I am trying to predict a time series with the ARIMA model in Python's statsmodels package, including an exogenous variable, but I cannot figure out the correct way to pass the exogenous variable in the predict step. See here for docs. import numpy as np from scipy import stats import pandas as pd import statsmodels.api as sm vals = np.random.rand(13) ts = pd.TimeSeries(vals) df = pd.DataFrame(ts, columns=["test"]) df.index = pd.Index(pd.date_range("2011/01/01", periods = len(vals), freq = 'Q'))

Difference between the interaction : and * term for formulas in StatsModels OLS regression

Question: Hi, I'm learning statsmodels and can't figure out the difference between : and * (interaction terms) in formulas for StatsModels OLS regression. Could you please give me a hint to figure this out? Thank you! The documentation: http://statsmodels.sourceforge.net/devel/example_formulas.html Answer 1: ":" gives a regression with only the interaction you have specified, without the levels (main effects) themselves. "*" gives a regression with the levels themselves plus the interaction you have specified. for example a .
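
The difference described in the answer can be illustrated by writing out which columns each formula contributes to the design matrix. The sketch below builds those columns by hand for two numeric predictors; it does not call statsmodels or patsy, and `design_columns` and the sample data are hypothetical. It assumes numeric `a` and `b`, where the interaction column is their elementwise product.

```python
def design_columns(a, b, op):
    """Hand-built design-matrix columns for 'y ~ a:b' (op=':')
    or 'y ~ a*b' (op='*') with numeric predictors a and b."""
    if op not in (":", "*"):
        raise ValueError("op must be ':' or '*'")
    cols = {"Intercept": [1.0] * len(a)}
    if op == "*":
        # a*b is shorthand for a + b + a:b, so the levels are included.
        cols["a"] = list(a)
        cols["b"] = list(b)
    # Both operators include the interaction term itself.
    cols["a:b"] = [x * z for x, z in zip(a, b)]
    return cols


a = [1.0, 2.0]
b = [3.0, 4.0]
print(list(design_columns(a, b, ":")))  # interaction only
print(list(design_columns(a, b, "*")))  # levels plus interaction
```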

Converting statsmodels summary object to Pandas Dataframe

Question: I am doing multiple linear regression with statsmodels.formula.api (ver 0.9.0) on Windows 10. After fitting the model and getting the summary with the following lines, I get the summary as a Summary object. X_opt = X[:, [0,1,2,3]] regressor_OLS = sm.OLS(endog= y, exog= X_opt).fit() regressor_OLS.summary() OLS Regression Results ============================================================================== Dep. Variable: y R-squared: 0.951 Model: OLS Adj. R-squared: 0.948 Method: Least Squares F