
Question:

I just began working on time series analysis using statsmodels. I have a dataset with dates and values (for about 3 months). I am facing some issues with providing the right order to the ARIMA model. I am looking to adjust for trends and seasonality and then compute outliers.

My 'values' are not stationary, and statsmodels says that I have to either induce stationarity or apply some differencing to make it work. I played around with different orders (without deeply understanding the consequences of changing p, d and q).

When I introduce 1 for differencing, I get this error:

ValueError: The start index -1 of the original series has been differenced away

When I remove the differencing by having my order as (say) order = (2,0,1), I get this error:

    raise ValueError("The computed initial AR coefficients are not "
    ValueError: The computed initial AR coefficients are not stationary
    You should induce stationarity, choose a different model order, or you can
    pass your own start_params.

Any help on how to induce stationarity (or a link to a nice tutorial) would be helpful. And, also, tests of stationarity (like, http://www.maths.bris.ac.uk/~guy/Research/LSTS/TOS.html) would be useful.

Update: I am reading through ADF test:

http://statsmodels.sourceforge.net/stable/generated/statsmodels.tsa.stattools.adfuller.html

Thanks! PD.

Answer 1:

To induce stationarity:

de-seasonalize (remove seasonality)
de-trend (remove trend)

There are several ways to make a time series stationary, such as the Box-Cox family of transformations and differencing. The choice of method depends on the data.
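As a rough illustration of the two steps above, here is a minimal sketch in plain Python. The helper names `difference` and `log_transform` and the synthetic `values` series are made up for this example; they are not part of statsmodels. A seasonal difference at the period of the pattern removes the seasonality, and a further first difference removes what is left of the trend. The log transform is the Box-Cox transform with lambda = 0.

```python
import math


def difference(series, lag=1):
    """Return series[t] - series[t - lag] for every t >= lag."""
    if lag < 1 or lag >= len(series):
        raise ValueError("lag must be between 1 and len(series) - 1")
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


def log_transform(series):
    """Box-Cox transform with lambda = 0 (natural log); values must be positive."""
    return [math.log(x) for x in series]


# Hypothetical daily data: a linear trend plus a repeating weekly pattern.
weekly = [0, 5, 3, 8, 2, 9, 4]
values = [10 + 2 * t + weekly[t % 7] for t in range(28)]

# Seasonal difference (period 7) removes the weekly pattern.
deseasonalized = difference(values, lag=7)

# A first difference then removes the remaining level/trend.
detrended = difference(deseasonalized, lag=1)

print(len(values), len(deseasonalized), len(detrended))
```

Each difference drops `lag` observations from the start of the series, which is worth keeping in mind with only about three months of data.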

Tests for stationarity:

1. Augmented Dickey-Fuller test
2. KPSS test

Answer 2:

You can use an R script instead of statsmodels. R is more powerful for statistical estimation.

If you want to use Python, you can run the R script from Python through the OS interface:

For example, here is an R script for ARIMA estimation, "arimaestimate.r":

    library(rjson)

    args <- commandArgs(trailingOnly=TRUE)

    jsonstring = ''

    for(i in seq(0, length(args))) {
        if ( length(args[i]) && args[i]=='--jsondata' ) {
            jsonstring = args[i+1]
        }
    }

    jsonobject = fromJSON(jsonstring)
    data = as.numeric(unlist(jsonobject['data']))
    p = as.numeric(unlist(jsonobject['p']))
    d = as.numeric(unlist(jsonobject['d']))
    q = as.numeric(unlist(jsonobject['q']))

    estimate = arima(data, order=c(p, d, q))

    phi = c()
    if (p>0) {
        for (i in seq(1, p)) {
            phi = c(phi, as.numeric(unlist(estimate$coef[i])))
        }
    }
    theta = c()
    if (p+1 <= p+q) {
        for (i in seq(p+1, p+q)) {
            theta = c(theta, as.numeric(unlist(estimate$coef[i])))
        }
    }
    if (d==0) {
        intercept = as.numeric(unlist(estimate$coef[p+q+1]))
    } else {
        intercept = 0.0
    }

    if (length(phi)) {
        if (length(phi)==1) {
            phi = list(phi)
        }
    } else {
        phi = list()
    }

    if (length(theta)) {
        if (length(theta)==1) {
            theta = list(-1 * theta)
        } else {
            theta = -1 * theta
        }
    } else {
        theta = list()
    }

    arimapredict = predict(estimate, n.ahead = 12)
    prediction = as.numeric(unlist(arimapredict$pred))
    predictionse = as.numeric(unlist(arimapredict$se))

    response = list(phi=phi,
                    theta=theta,
                    intercept=intercept,
                    sigma2=estimate$sigma2,
                    aic=estimate$aic,
                    prediction=prediction,
                    predictionse=predictionse)

    cat(toJSON(response))

Then call it from Python through the JSON interface:

Rscript arima/arimaestimate.r --jsondata '{"q": 2, "p": 2, "data": [247.0, 249.0, 213.0, 154.0, 122.0, 164.0, 141.0, 174.0, 281.0, 141.0, 159.0, 168.0, 243.0, 261.0, 211.0, 303.0, 308.0, 239.0, 237.0, 185.0], "d": 1}'

and you get the answer:

    {
        "phi": [],
        "theta": [
            0.407851844478153
        ],
        "intercept": 0,
        "sigma2": 3068.29837379914,
        "aic": 210.650287294343,
        "prediction": [
            210.184175597721,
            210.184175597721,
            210.184175597721,
            210.184175597721,
            210.184175597721,
            210.184175597721,
            210.184175597721,
            210.184175597721,
            210.184175597721,
            210.184175597721,
            210.184175597721,
            210.184175597721
        ]
    }
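On the Python side, the call could be wrapped roughly as follows. This is only a sketch: the function names `build_rscript_command` and `estimate_arima` are invented here, and it assumes `Rscript` is on the PATH and that the script prints nothing but the JSON response to stdout. Passing the arguments as a list avoids having to shell-quote the JSON string.

```python
import json
import subprocess


def build_rscript_command(script_path, data, p, d, q):
    """Build the argument list for the Rscript call shown above."""
    payload = json.dumps({"p": p, "d": d, "q": q, "data": list(data)})
    return ["Rscript", script_path, "--jsondata", payload]


def estimate_arima(script_path, data, p, d, q):
    """Run the R script and parse the JSON it writes to stdout."""
    cmd = build_rscript_command(script_path, data, p, d, q)
    # check=True raises CalledProcessError if R exits with an error.
    completed = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return json.loads(completed.stdout)


# Building the command does not require R to be installed.
command = build_rscript_command(
    "arima/arimaestimate.r", [247.0, 249.0, 213.0, 154.0], p=2, d=1, q=2
)
print(command)
```

With R and the rjson package installed, `estimate_arima("arima/arimaestimate.r", data, 2, 1, 2)` should return a dictionary with the same keys as the response above.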

Source: Python Statsmodel ARIMA start [stationarity]

Tags

arima

python