statsmodels

What to use to do multiple correlation?

Submitted by て烟熏妆下的殇ゞ on 2019-12-20 23:57:28
Question: I am trying to use Python to compute multiple linear regression and multiple correlation between a response array and a set of predictor arrays. I saw the very simple example for computing multiple linear regression, which is easy. But how do I compute multiple correlation with statsmodels? Or with anything else, as an alternative? I guess I could use rpy and R, but I'd prefer to stay in Python if possible. Edit [clarification]: Consider a situation like the one described here: http:/

bug of autocorrelation plot in matplotlib's plt.acorr?

Submitted by 不打扰是莪最后的温柔 on 2019-12-20 08:50:19
Question: I am plotting autocorrelation with Python. I used three ways to do it: 1. pandas, 2. matplotlib, 3. statsmodels. I found that the graph I get from matplotlib is not consistent with the other two. The code is:

from statsmodels.graphics.tsaplots import *
# print out data
print mydata.values
# 1. pandas
p = autocorrelation_plot(mydata)
plt.title('mydata')
# 2. matplotlib
fig = plt.figure()
plt.acorr(mydata, maxlags=150)
plt.title('mydata')
# 3. statsmodels.graphics.tsaplots.plot_acf
plot_acf(mydata)
plt
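A likely explanation for the discrepancy: `plt.acorr` correlates the raw values, while pandas' `autocorrelation_plot` and statsmodels' `plot_acf` work on the mean-removed series. A sketch of the usual fix, on synthetic data with a nonzero mean (`mydata` here is illustrative, not the poster's series):

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend for the example
import matplotlib.pyplot as plt

# Synthetic series with a clearly nonzero mean.
rng = np.random.default_rng(1)
mydata = rng.normal(loc=5.0, size=200)

fig, ax = plt.subplots()
# Subtract the mean before calling acorr so the result is comparable
# to pandas autocorrelation_plot and statsmodels plot_acf.
lags, acf_vals, _, _ = ax.acorr(mydata - mydata.mean(), maxlags=20, normed=True)

# With normed=True, the lag-0 autocorrelation is exactly 1.
print(float(acf_vals[lags == 0][0]))
```

Without the demeaning step, a large mean dominates the products and `acorr` stays near 1 at every lag, which matches the "inconsistent graph" symptom.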

Why do I get only one parameter from a statsmodels OLS fit

Submitted by 為{幸葍}努か on 2019-12-20 08:28:25
Question: Here is what I am doing:

$ python
Python 2.7.6 (v2.7.6:3a1db0d2747e, Nov 10 2013, 00:42:54)
[GCC 4.2.1 (Apple Inc. build 5666) (dot 3)] on darwin
>>> import statsmodels.api as sm
>>> statsmodels.__version__
'0.5.0'
>>> import numpy
>>> y = numpy.array([1,2,3,4,5,6,7,8,9])
>>> X = numpy.array([1,1,2,2,3,3,4,4,5])
>>> res_ols = sm.OLS(y, X).fit()
>>> res_ols.params
array([ 1.82352941])

I had expected an array with two elements: the intercept and the slope coefficient.

Answer 1: Try this:

X = sm

ADF test in statsmodels in Python

Submitted by 耗尽温柔 on 2019-12-19 08:08:30
Question: I am trying to run an Augmented Dickey-Fuller test in statsmodels in Python, but I seem to be missing something. This is the code that I am trying:

import numpy as np
import statsmodels.tsa.stattools as ts
x = np.array([1,2,3,4,3,4,2,3])
result = ts.adfuller(x)

I get the following error:

Traceback (most recent call last):
  File "C:\Users\Akavall\Desktop\Python\Stats_models\stats_models_test.py", line 12, in <module>
    result = ts.adfuller(x)
  File "C:\Python27\lib\site-packages\statsmodels-0.4.1

Appending predicted values and residuals to pandas dataframe

Submitted by Deadly on 2019-12-19 03:22:08
Question: It's a useful and common practice to append predicted values and residuals from a regression onto a dataframe as distinct columns. I'm new to pandas, and I'm having trouble performing this very simple operation. I know I'm missing something obvious. There was a very similar question asked about a year and a half ago, but it wasn't really answered. The dataframe currently looks something like this:

        y     x1  x2
   880.37   3.17  23
   716.20   4.76  26
   974.79   4.17  73
   322.80   8.70  72
  1054.25  11.45  16

And

Python Statsmodel ARIMA start [stationarity]

Submitted by 纵饮孤独 on 2019-12-18 15:49:36
Question: I just began working on time series analysis using statsmodels. I have a dataset with dates and values (covering about 3 months). I am facing some issues with providing the right order to the ARIMA model. I am looking to adjust for trend and seasonality and then compute outliers. My 'values' are not stationary, and statsmodels says that I have to either induce stationarity or provide some differencing to make it work. I played around with different orders (without understanding deeply about the

Decomposing trend, seasonal and residual time series elements

Submitted by 北战南征 on 2019-12-18 12:56:38
Question: I have a DataFrame with a few time series:

Date     divida  movav12    var        varmovav12
2004-01  0       NaN        NaN        NaN
2004-02  0       NaN        NaN        NaN
2004-03  0       NaN        NaN        NaN
2004-04  34      NaN        inf        NaN
2004-05  30      NaN        -0.117647  NaN
2004-06  44      NaN        0.466667   NaN
2004-07  35      NaN        -0.204545  NaN
2004-08  31      NaN        -0.114286  NaN
2004-09  30      NaN        -0.032258  NaN
2004-10  24      NaN        -0.200000  NaN
2004-11  41      NaN        0.708333   NaN
2004-12  29      24.833333  -0.292683  NaN
2005-01  31      27.416667  0.068966   0.104027
2005-02  28      29.750000  -0.096774  0.085106
2005-03  27      32

Python statsmodels OLS: how to save learned model to file

Submitted by 坚强是说给别人听的谎言 on 2019-12-18 11:55:19
Question: I am trying to learn an ordinary least squares model using Python's statsmodels library, as described here. sm.OLS.fit() returns the learned model. Is there a way to save it to a file and reload it? My training data is huge, and it takes around half a minute to learn the model. So I was wondering if any save/load capability exists for the OLS model. I tried the repr() method on the

Why would R-Squared decrease when I add an exogenous variable in OLS using python statsmodels

Submitted by 我是研究僧i on 2019-12-18 05:02:45
Question: If I understand the OLS model correctly, this should never be the case?

trades['const'] = 1
Y = trades['ret'] + trades['comms']
#X = trades[['potential', 'pVal', 'startVal', 'const']]
X = trades[['potential', 'pVal', 'startVal']]
from statsmodels.regression.linear_model import OLS
ols = OLS(Y, X)
res = ols.fit()
res.summary()

If I turn the const on, I get an R-squared of 0.22, and with it off, I get 0.43. How is that even possible?

Answer 1: See the answer here: Statsmodels: Calculate fitted values and R