statsmodels | 易学教程

ARMAX model forecasting leads to “ValueError: matrices are not aligned” when passing exog values

Question: I'm struggling to forecast out-of-sample values with an ARMAX model. Fitting the model works fine:

    armax_mod31 = sm.tsa.ARMA(endog = sales, order = (3,1), exog = media).fit()
    armax_mod31.fittedvalues

Forecasting without exogenous values, using the corresponding ARMA model, works fine as well:

    arma_mod31 = sm.tsa.ARMA(sales, (3,1)).fit()
    all_arma = arma_mod31.forecast(steps = 14, alpha = 0.05)
    forecast_arma = Series(res_arma[0], index = pd.date_range(start = "2013-08-21", periods = 14))

invert a log/diff transform for plotting

Question: tl;dr:

    for app in endog:
        min_nonzero = series[series[app] > 0].min()[0]
        series.loc[series[app] == 0, app] = min_nonzero
        series[app + '_log_diff'] = np.log(series[app]).diff()
    series = series.replace([np.inf, -np.inf], np.nan).dropna()

How do I invert that for plotting?

Full text: I'm having trouble inverting the log/diff transform I apply to remove non-stationarity. Here's the transform:

    series = u[columns].copy()
    endogdiffs = []
    for app in endog:
        min_nonzero = series[series[app] > 0].min()[0]
        series.loc
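The inverse of a log-then-difference transform can be sketched as follows: cumulatively sum the differences starting from the log of the first observation, then exponentiate. This is a minimal standard-library illustration, not the asker's code; the function names `log_diff` and `invert_log_diff` and the sample values are made up for the example.

```python
import math
from itertools import accumulate


def log_diff(values):
    """Return first differences of log(values); one element shorter than the input."""
    logs = [math.log(v) for v in values]
    return [b - a for a, b in zip(logs, logs[1:])]


def invert_log_diff(diffs, first_value):
    """Rebuild the series from its log-differences and its first observation."""
    # Running sums that start at log(first_value) recover log(x_t) for every t;
    # exponentiating then undoes the log.
    logs = accumulate(diffs, initial=math.log(first_value))
    return [math.exp(v) for v in logs]


original = [3.0, 4.5, 4.0, 6.0, 9.0]  # hypothetical positive values
diffs = log_diff(original)
restored = invert_log_diff(diffs, original[0])
```

Two caveats apply to the transform in the question. The first observation has to be saved before differencing, because `.diff()` followed by `.dropna()` discards it. And since zeros were replaced by the smallest non-zero value before taking logs, inverting returns that modified series, not the original zeros.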

Statsmodels MANOVA : IndexError: index 1 is out of bounds for axis 0 with size 1

Question: I have spent hours trying to make statsmodels run my MANOVA, without success. Here is the code:

    from statsmodels.multivariate.manova import MANOVA
    df = data
    feats_list = ['col1', 'col2', 'col3' ... 'col4']
    var_list = ['col5', 'col6']
    endog, exog = np.asarray(df[feats_list]), np.asarray(df[var_list])
    manov = MANOVA(endog, exog)
    manov.mv_test()

This produces:

    ---------------------------------------------------------------------------
    IndexError Traceback (most recent call last) <ipython-input-16

How to find out the slope value by applying linear regression on trend of a data?

Question: I have time series data from which I can extract the trend. Now I need to fit a regression line to the trend data and find out whether its slope is positive, negative, or constant. Below is my CSV file containing the data:

    date,cpu
    2018-02-10 11:52:59.342269+00:00,6.0
    2018-02-10 11:53:04.006971+00:00,6.0
    2018-02-10 22:35:33.438948+00:00,4.0
    2018-02-10 22:35:37.905242+00:00,4.0
    2018-02-11 12:01:00.663084+00:00,4.0
    2018-02-11 12:01:05.136107+00:00,4.0
    2018-02
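As a rough illustration of the slope check being asked about, the sketch below fits an ordinary least-squares line to the six complete rows visible in the excerpt, using only the standard library. It is run on the raw `cpu` values as a stand-in for the extracted trend, which the excerpt does not show; the helper names `ols_slope` and `describe_slope` and the tolerance `tol` are assumptions made for the example.

```python
from datetime import datetime

# The six complete rows visible in the question's CSV excerpt.
rows = [
    ("2018-02-10 11:52:59.342269+00:00", 6.0),
    ("2018-02-10 11:53:04.006971+00:00", 6.0),
    ("2018-02-10 22:35:33.438948+00:00", 4.0),
    ("2018-02-10 22:35:37.905242+00:00", 4.0),
    ("2018-02-11 12:01:00.663084+00:00", 4.0),
    ("2018-02-11 12:01:05.136107+00:00", 4.0),
]


def ols_slope(x, y):
    """Slope of the least-squares line y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


def describe_slope(slope, tol=1e-9):
    """Classify the slope; tol is a hypothetical threshold for 'constant'."""
    if slope > tol:
        return "positive"
    if slope < -tol:
        return "negative"
    return "constant"


# Timestamps are not numbers, so regress on hours elapsed since the first row.
times = [datetime.fromisoformat(stamp) for stamp, _ in rows]
hours = [(t - times[0]).total_seconds() / 3600 for t in times]
cpu = [value for _, value in rows]

slope = ols_slope(hours, cpu)  # change in cpu per hour
print(describe_slope(slope))
```

With real trend data, a tolerance suited to the scale of the series would be needed to decide when a small slope counts as constant.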

Multivariate Breusch Godfrey Lagrange Multiplier tests in Python

Question: I understand that the statsmodels package has many statistical functions for testing various issues, including the Breusch-Godfrey Lagrange multiplier test, as described here. However, as far as I can tell, this only handles the univariate case and not the multivariate case. For example, suppose I have a two-dimensional dataset, say data:

    from statsmodels.tsa.api import VAR
    import statsmodels.api as sm,statsmodels as sm1
    data= np.random.random((108, 2))
    Model=VAR(data)
    results = Model.fit(1)

Inter-rater reliability calculation for multi-raters data

Question: I have the following list of lists:

    [[1, 1, 1, 1, 3, 0, 0, 1],
     [1, 1, 1, 1, 3, 0, 0, 1],
     [1, 1, 1, 1, 2, 0, 0, 1],
     [1, 1, 0, 2, 3, 1, 0, 1]]

I want to calculate an inter-rater reliability score for it; there are multiple raters (rows). I cannot use Fleiss' kappa, since the rows do not sum to the same number. What is a good approach in this case?

Answer 1: The basic problem here is that you have not properly applied the data you're given. See here for the proper organization. You have four categories
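The reorganization the answer points to can be sketched like this, assuming each row of the list is one rater and each column is one rated subject (the excerpt does not spell that out). The ratings are first turned into a subjects-by-categories count table, whose rows all sum to the number of raters, and Fleiss' kappa is then computed from that table. The function name `fleiss_kappa_from_counts` is made up for this example; statsmodels offers `aggregate_raters` and `fleiss_kappa` in `statsmodels.stats.inter_rater` for the same two steps.

```python
from collections import Counter

# Data from the question; assumed layout: one row per rater, one column per subject.
ratings = [
    [1, 1, 1, 1, 3, 0, 0, 1],
    [1, 1, 1, 1, 3, 0, 0, 1],
    [1, 1, 1, 1, 2, 0, 0, 1],
    [1, 1, 0, 2, 3, 1, 0, 1],
]

categories = sorted({r for row in ratings for r in row})

# Count table: one row per subject, one column per category,
# each cell holding the number of raters who chose that category.
table = []
for subject_ratings in zip(*ratings):
    counts = Counter(subject_ratings)
    table.append([counts[c] for c in categories])


def fleiss_kappa_from_counts(table):
    """Fleiss' kappa for a subjects x categories table of rater counts."""
    n_subjects = len(table)
    n_raters = sum(table[0])
    # Observed agreement per subject, averaged over subjects.
    p_i = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in table
    ]
    p_bar = sum(p_i) / n_subjects
    # Agreement expected by chance from the overall category proportions.
    p_j = [sum(col) / (n_subjects * n_raters) for col in zip(*table)]
    p_e = sum(p * p for p in p_j)
    return (p_bar - p_e) / (1 - p_e)


kappa = fleiss_kappa_from_counts(table)
print(table)
print(round(kappa, 4))
```

Every row of `table` sums to 4, the number of raters, which is the condition the question reported as failing on the raw list.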

Getting a simple predict from OLS something different from .6 to .8 of StatsModels

Question: Sorry for cross-posting this, but I can't get past it. I cannot get output from the predict function. I have an OLS model that used to work with statsmodels 0.6 and no longer works in 0.8; pandas also went from 0.19.2 to 0.20.3, so could that be the issue? I just don't understand what I need to feed to the predict method. My model creation looks like this:

    def fit_line2(x, y):
        X = sm.add_constant(x, prepend=True) #Add a column of ones to allow the calculation of the intercept
        ols_test = sm.OLS(y, X,missing='drop'

Python statsmodels trouble getting fitted model parameters

Question: I'm using an AR model to fit my data, and I think I have done that successfully, but now I want to see the fitted model parameters and I am running into trouble. Here is my code:

    model=ar.AR(df['price'],freq='M')
    ar_res=model.fit(maxlags=50,ic='bic')

which runs without any error. However, when I try to print the model parameters with

    print ar_res.params

I get the error

    AssertionError: Index length did not match values

Answer 1: I am unable to reproduce

How exactly BIC in Augmented Dickey–Fuller test work in Python?

Question: This question is about the Augmented Dickey–Fuller test implementation in the statsmodels.tsa.stattools Python library, adfuller(). In principle, AIC and BIC are supposed to compute an information criterion for a set of candidate models and pick the best one (the one with the lowest information loss). But how do they operate in the context of Augmented Dickey–Fuller? What I don't get: I set maxlag=30 and BIC chose lags=5 with some information criterion. I set maxlag=40 and BIC still chooses

Robustness issue of statsmodel Linear regression (ols) - Python

Question: I was testing some basic categorical regression using statsmodels. I built a deterministic model Y = X + Z, where X can take 3 values (a, b or c) and Z only 2 (d or e). At this stage the model is purely deterministic. I set the weights for each variable as follows:

    a's weight=1
    b's weight=2
    c's weight=3
    d's weight=1
    e's weight=2

Therefore, with 1(X=a) being 1 if X=a and 0 otherwise, the model is simply:

    Y = 1(X=a) + 2*1(X=b) + 3*1(X=c) + 1(Z=d) + 2*1(Z=e)

Using the following code, to generate
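The excerpt ends before the code and the observed problem, so the following only illustrates one property of the model as written and may not be the issue the asker hit. When every level of both X and Z gets its own indicator, the weights are not uniquely determined: adding a constant to all X weights and subtracting it from all Z weights leaves Y unchanged for every combination. The `shift` value and helper name `predict` are arbitrary choices for the example.

```python
from itertools import product


def predict(weights, x, z):
    """Y = weight of the X level + weight of the Z level, as in the question's model."""
    return weights[x] + weights[z]


# Weights stated in the question.
weights_a = {"a": 1, "b": 2, "c": 3, "d": 1, "e": 2}

# Alternative weights: shift every X level up and every Z level down by the same amount.
shift = 5  # arbitrary constant chosen for illustration
weights_b = {
    level: (w + shift if level in "abc" else w - shift)
    for level, w in weights_a.items()
}

combos = list(product("abc", "de"))
identical = all(
    predict(weights_a, x, z) == predict(weights_b, x, z) for x, z in combos
)
print(identical)
```

Because two different weight sets reproduce the same Y on every (X, Z) pair, a regression on the full set of indicators cannot be expected to return exactly the weights that generated the data unless one level per variable is dropped or another constraint is imposed.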