statsmodels

What to use to do multiple correlation?

Submitted by て烟熏妆下的殇ゞ on 2019-12-20 23:57:28
Question: I am trying to use Python to compute multiple linear regression and multiple correlation between a response array and a set of predictor arrays. I saw the very simple example for computing multiple linear regression, which is easy. But how do I compute multiple correlation with statsmodels? Or with anything else, as an alternative? I guess I could use rpy and R, but I'd prefer to stay in Python if possible. Edit [clarification]: Consider a situation like the one described here: http:/

bug of autocorrelation plot in matplotlib's plt.acorr?

Submitted by 不打扰是莪最后的温柔 on 2019-12-20 08:50:19
Question: I am plotting autocorrelation with Python. I used three ways to do it: 1. pandas, 2. matplotlib, 3. statsmodels. I found that the graph I get from matplotlib is not consistent with the other two. The code is:

from statsmodels.graphics.tsaplots import *
# print out data
print mydata.values
# 1. pandas
p = autocorrelation_plot(mydata)
plt.title('mydata')
# 2. matplotlib
fig = plt.figure()
plt.acorr(mydata, maxlags=150)
plt.title('mydata')
# 3. statsmodels.graphics.tsaplots.plot_acf
plot_acf(mydata)
plt
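A likely explanation for the discrepancy: `plt.acorr` correlates the raw values, while pandas' `autocorrelation_plot` and statsmodels' `plot_acf` work on the mean-removed series. A sketch of the usual fix, on synthetic data with a nonzero mean (`mydata` here is illustrative, not the poster's series):

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend for the example
import matplotlib.pyplot as plt

# Synthetic series with a clearly nonzero mean.
rng = np.random.default_rng(1)
mydata = rng.normal(loc=5.0, size=200)

fig, ax = plt.subplots()
# Subtract the mean before calling acorr so the result is comparable
# to pandas autocorrelation_plot and statsmodels plot_acf.
lags, acf_vals, _, _ = ax.acorr(mydata - mydata.mean(), maxlags=20, normed=True)

# With normed=True, the lag-0 autocorrelation is exactly 1.
print(float(acf_vals[lags == 0][0]))
```

Without the demeaning step, a large mean dominates the products and `acorr` stays near 1 at every lag, which matches the "inconsistent graph" symptom.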

Why do I get only one parameter from a statsmodels OLS fit

Submitted by 為{幸葍}努か on 2019-12-20 08:28:25
Question: Here is what I am doing:

$ python
Python 2.7.6 (v2.7.6:3a1db0d2747e, Nov 10 2013, 00:42:54)
[GCC 4.2.1 (Apple Inc. build 5666) (dot 3)] on darwin
>>> import statsmodels.api as sm
>>> statsmodels.__version__
'0.5.0'
>>> import numpy
>>> y = numpy.array([1,2,3,4,5,6,7,8,9])
>>> X = numpy.array([1,1,2,2,3,3,4,4,5])
>>> res_ols = sm.OLS(y, X).fit()
>>> res_ols.params
array([ 1.82352941])

I had expected an array with two elements: the intercept and the slope coefficient.

Answer 1: Try this:

X = sm

ADF test in statsmodels in Python

Submitted by 耗尽温柔 on 2019-12-19 08:08:30
Question: I am trying to run an Augmented Dickey-Fuller test in statsmodels in Python, but I seem to be missing something. This is the code that I am trying:

import numpy as np
import statsmodels.tsa.stattools as ts
x = np.array([1,2,3,4,3,4,2,3])
result = ts.adfuller(x)

I get the following error:

Traceback (most recent call last):
  File "C:\Users\Akavall\Desktop\Python\Stats_models\stats_models_test.py", line 12, in <module>
    result = ts.adfuller(x)
  File "C:\Python27\lib\site-packages\statsmodels-0.4.1

Appending predicted values and residuals to pandas dataframe

Submitted by Deadly on 2019-12-19 03:22:08
Question: It's a useful and common practice to append predicted values and residuals from a regression onto a dataframe as distinct columns. I'm new to pandas, and I'm having trouble performing this very simple operation. I know I'm missing something obvious. There was a very similar question asked about a year and a half ago, but it wasn't really answered. The dataframe currently looks something like this:

        y     x1  x2
   880.37   3.17  23
   716.20   4.76  26
   974.79   4.17  73
   322.80   8.70  72
  1054.25  11.45  16

And

Python Statsmodel ARIMA start [stationarity]

Submitted by 纵饮孤独 on 2019-12-18 15:49:36
Question: I just began working on time series analysis using statsmodels. I have a dataset with dates and values (covering about 3 months). I am facing some issues with providing the right order to the ARIMA model. I am looking to adjust for trend and seasonality and then compute outliers. My 'values' are not stationary, and statsmodels says that I have to either induce stationarity or provide some differencing to make it work. I played around with different orders (without understanding deeply about the

Decomposing trend, seasonal and residual time series elements

Submitted by 北战南征 on 2019-12-18 12:56:38
Question: I have a DataFrame with a few time series:

Date     divida  movav12    var        varmovav12
2004-01  0       NaN        NaN        NaN
2004-02  0       NaN        NaN        NaN
2004-03  0       NaN        NaN        NaN
2004-04  34      NaN        inf        NaN
2004-05  30      NaN        -0.117647  NaN
2004-06  44      NaN        0.466667   NaN
2004-07  35      NaN        -0.204545  NaN
2004-08  31      NaN        -0.114286  NaN
2004-09  30      NaN        -0.032258  NaN
2004-10  24      NaN        -0.200000  NaN
2004-11  41      NaN        0.708333   NaN
2004-12  29      24.833333  -0.292683  NaN
2005-01  31      27.416667  0.068966   0.104027
2005-02  28      29.750000  -0.096774  0.085106
2005-03  27      32

Python statsmodels OLS: how to save learned model to file

Submitted by 坚强是说给别人听的谎言 on 2019-12-18 11:55:19
Question: I am trying to learn an ordinary least squares model using Python's statsmodels library, as described here. sm.OLS.fit() returns the learned model. Is there a way to save it to a file and reload it? My training data is huge, and it takes around half a minute to learn the model. So I was wondering if any save/load capability exists for the OLS model. I tried the repr() method on the

Why would R-Squared decrease when I add an exogenous variable in OLS using python statsmodels

Submitted by 我是研究僧i on 2019-12-18 05:02:45
Question: If I understand the OLS model correctly, this should never be the case?

trades['const'] = 1
Y = trades['ret'] + trades['comms']
#X = trades[['potential', 'pVal', 'startVal', 'const']]
X = trades[['potential', 'pVal', 'startVal']]
from statsmodels.regression.linear_model import OLS
ols = OLS(Y, X)
res = ols.fit()
res.summary()

If I turn the const on, I get an R-squared of 0.22, and with it off, I get 0.43. How is that even possible?

Answer 1: See the answer here: Statsmodels: Calculate fitted values and R