statsmodels

Character causing syntax issue with statsmodels

青春壹個敷衍的年華 submitted on 2019-12-13 05:43:18
Question: I'm trying to fit a linear model to some data using the code below, but I'm getting the error shown underneath it. I think the problem is the '%' in the field name; many fields in my data follow this naming convention. Does anyone know how to work around this in statsmodels?

code:

mod = ols('fieldA%'+'~'+'fieldB', data=smp_df).fit()

error:

Traceback (most recent call last):
  File "C:\Users\username\AppDataPython\envs\py36\lib\site-packages\IPython\core\interactiveshell.py", line 3267, in run
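A common workaround (a minimal sketch, assuming the formula interface from statsmodels.formula.api and a column literally named 'fieldA%') is to quote the column name with patsy's Q() so the '%' never reaches the formula parser:

```python
import pandas as pd
from statsmodels.formula.api import ols

# Illustrative frame with a column name containing '%'
smp_df = pd.DataFrame({"fieldA%": [1.0, 2.1, 2.9, 4.2],
                       "fieldB":  [1, 2, 3, 4]})

# Q() makes patsy treat the quoted string as a literal column name,
# so the '%' is never parsed as formula syntax
mod = ols("Q('fieldA%') ~ fieldB", data=smp_df).fit()
print(mod.params)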

Is there a python t test for difference? [duplicate]

末鹿安然 submitted on 2019-12-13 05:07:18
Question: This question already has answers here: How to calculate the statistics "t-test" with numpy (3 answers). Closed 5 years ago. Is there a t test in a Python package where you can test for a specified difference? If there is, how do you use it? For example, to test two vectors: a = np.random.randn(5, 7) and b = np.random.randn(5, 7). I have found the t test ttest_ind in statsmodels, but I would like to specify the difference to test for, and that cannot be passed into the ttest_ind function. Does anybody know another
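One workaround (a sketch, not from the original thread, using made-up 1-D samples): under the null hypothesis mean(a) - mean(b) = d, shifting one sample by d turns the problem into an ordinary two-sample t-test, which scipy.stats.ttest_ind handles directly:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
a = rng.normal(loc=1.5, scale=1.0, size=200)
b = rng.normal(loc=1.0, scale=1.0, size=200)

d = 0.5  # hypothesised difference mean(a) - mean(b) under the null

# Testing H0: mean(a) - mean(b) = d is equivalent to testing
# H0: mean(a) - mean(b + d) = 0
t_stat, p_value = stats.ttest_ind(a, b + d)
print(t_stat, p_value)
```

statsmodels' own statsmodels.stats.weightstats.ttest_ind also appears to take a value argument for the hypothesised difference, which would avoid the shift, but check the version you have installed.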

nonlinear least-square with Python statsmodels

丶灬走出姿态 submitted on 2019-12-12 16:27:17
Question: Within the Python library statsmodels, is it possible to perform a nonlinear least-squares fit with a nonlinear parameter? In other words, I would like to find the best fit (in terms of least squares) for p in the following model:

y = ln(p)*x^2 + p

Assuming I have a set of observations x and y, I can use the function scipy.optimize.leastsq to find the best fit. Here is an example:

import numpy as np
import scipy.optimize as spopt
import matplotlib.pyplot as plt

def fitFunc(p, x):
    return
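For reference, a completed version of the scipy route (a sketch using curve_fit rather than the question's leastsq, with made-up observations) for the model y = ln(p)*x^2 + p could look like this:

```python
import numpy as np
from scipy.optimize import curve_fit

# The model from the question, with a single nonlinear parameter p
def model(x, p):
    return np.log(p) * x ** 2 + p

# Made-up observations for illustration (true p = 3)
rng = np.random.default_rng(42)
x = np.linspace(1.0, 10.0, 50)
y = model(x, 3.0) + rng.normal(scale=2.0, size=x.size)

# Bounds keep p positive so log(p) stays defined during the search
p_hat, p_cov = curve_fit(model, x, y, p0=[1.0], bounds=(1e-6, np.inf))
print(p_hat)  # least-squares estimate of p
```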

How to properly set start/end params of statsmodels.tsa.ar_model.AR.predict function

佐手、 submitted on 2019-12-12 15:43:38
Question: I have a dataframe of project costs from an irregularly spaced time series that I would like to try to fit with the statsmodels AR model. This is a sample of the data in its dataframe:

            cost
date
2015-07-16  35.98
2015-08-11  25.00
2015-08-11  43.94
2015-08-13  26.25
2015-08-18  15.38
2015-08-24  77.72
2015-09-09  40.00
2015-09-09  20.00
2015-09-09  65.00
2015-09-23  70.50
2015-09-29  59.00
2015-11-03  19.25
2015-11-04  19.97
2015-11-10  26.25
2015-11-12  19.97
2015-11-12  23.97
2015-11-12  21.88
2015-11
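One possible approach (a sketch, not the accepted answer): give the data a DatetimeIndex, resample it to a regular frequency, and then start/end can be passed as dates. The newer AutoReg class is used here in place of the deprecated AR:

```python
import pandas as pd
from statsmodels.tsa.ar_model import AutoReg  # successor to the deprecated AR class

# A subset of the irregularly spaced observations from the question
costs = pd.Series(
    [35.98, 25.00, 43.94, 26.25, 15.38, 77.72, 40.00, 70.50, 59.00,
     19.25, 26.25, 21.88],
    index=pd.to_datetime([
        "2015-07-16", "2015-08-11", "2015-08-11", "2015-08-13", "2015-08-18",
        "2015-08-24", "2015-09-09", "2015-09-23", "2015-09-29", "2015-11-03",
        "2015-11-10", "2015-11-12",
    ]),
)

# An AR model needs a regular frequency, so aggregate to weekly totals first
weekly = costs.resample("W").sum()

res = AutoReg(weekly, lags=1).fit()
# Because the index now carries a frequency, start/end may be dates (or positions)
pred = res.predict(start=weekly.index[-1], end="2016-01-31")
print(pred)
```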

Undo a Series Diff

a 夏天 submitted on 2019-12-12 15:23:01
Question: I have a pandas Series with monthly data (df.sales). I needed to subtract the data from 12 months earlier to fit a time series, so I ran this command:

sales_new = df.sales.diff(periods=12)

I then fit an ARMA model and predicted the future:

model = ARMA(sales_new, order=(2,0)).fit()
model.predict('2015-01-01', '2017-01-01')

Because I had diffed the sales data, when I use the model to predict, it predicts forward diffs. If this were a diff of period 1, I would just use np.cumsum(), but because
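A way to invert a 12-period difference (a sketch, not from the original thread): each restored level equals the predicted difference plus the level 12 periods earlier, reconstructed step by step from the last 12 observed values:

```python
import pandas as pd

def undo_seasonal_diff(last_levels, diffs, periods=12):
    """Invert Series.diff(periods): each restored level is the predicted
    difference plus the (reconstructed) level from `periods` steps earlier.

    last_levels: the final `periods` observed values of the original series.
    diffs: predicted differences, in time order, immediately following them.
    """
    history = list(last_levels)[-periods:]
    restored = []
    for d in diffs:
        value = history[-periods] + d
        restored.append(value)
        history.append(value)
    return pd.Series(restored, index=getattr(diffs, "index", None))

# Hypothetical usage with the names from the question:
# restored_sales = undo_seasonal_diff(df.sales, model.predict('2015-01-01', '2017-01-01'))
```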

Statsmodels Poisson glm different than R

北战南征 submitted on 2019-12-12 15:07:02
Question: I am trying to fit some models (spatial interaction models) following code that is provided in R. I have been able to get some of the code to work using statsmodels in a Python framework, but some of the results do not match at all. I believe that the R and Python code should give identical results. Does anyone see any differences, or is there some fundamental difference between the two that might be throwing things off? The R code is the original code, which matches the numbers given in a
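For comparison, the usual statsmodels counterpart of an R Poisson GLM (a sketch with made-up column names and data, not the question's actual spatial interaction model) looks like this; mismatches with R often come down to how categorical variables, offsets, or reference levels are encoded:

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

# Tiny made-up dataset; column names are hypothetical, not from the question
df = pd.DataFrame({
    "flows":  [10, 4, 7, 15, 3, 9],
    "dist":   [2.0, 5.0, 3.5, 1.2, 6.0, 2.8],
    "origin": ["A", "B", "A", "C", "B", "C"],
})

# Rough equivalent of R's glm(flows ~ log(dist) + origin, family = poisson())
res = smf.glm("flows ~ np.log(dist) + C(origin)", data=df,
              family=sm.families.Poisson()).fit()
print(res.summary())
```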

Pandas Statsmodels ols regression prediction using DF predictor?

天大地大妈咪最大 submitted on 2019-12-12 14:11:23
Question: Using pandas OLS I am able to fit and use a model as follows:

ols_test = pd.ols(y=merged2[:-1].Units, x=merged2[:-1].lastqu)  # to exclude current year, then do forecast method
yrahead = (ols_test.beta['x'] * merged2.lastqu[-1:]) + ols_test.beta['intercept']

I needed to switch to statsmodels to get some additional functionality (mainly the residual plots, see question here). So now I have:

def fit_line2(x, y):
    X = sm.add_constant(x, prepend=True)  # Add a column of ones to allow the calculation of
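A rough statsmodels equivalent of the pd.ols fit-and-forecast above (a sketch with an illustrative stand-in for merged2, since the real frame is not shown) could be:

```python
import pandas as pd
import statsmodels.api as sm

# Illustrative stand-in for the question's merged2 frame (Units ~ lastqu);
# the last row is the "current year", with Units still unknown
merged2 = pd.DataFrame({
    "Units":  [100.0, 120.0, 90.0, 150.0, 130.0, None],
    "lastqu": [95.0, 110.0, 100.0, 140.0, 125.0, 135.0],
})

# Fit on everything but the last row, as the pd.ols version did
train = merged2.iloc[:-1]
X = sm.add_constant(train["lastqu"], prepend=True)
res = sm.OLS(train["Units"], X).fit()

# Forecast from the newest lastqu value; has_constant="add" forces the
# constant column even though a single row looks "constant" already
X_new = sm.add_constant(merged2["lastqu"].iloc[-1:], prepend=True, has_constant="add")
yrahead = res.predict(X_new)
print(yrahead)
```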

Can we generate contingency table for chisquare test using python?

最后都变了- submitted on 2019-12-12 11:51:48
Question: I am using the scipy.stats.chi2_contingency method to get the chi-square statistic. We need to pass a frequency table, i.e. a contingency table, as the parameter. But I have a feature vector and want to generate the frequency table automatically. Is there any such function available? I am currently doing it like this:

def contigency_matrix_categorical(data_series, target_series, target_val, indicator_val):
    observed_freq = {}
    for targets in target_val:
        observed_freq[targets] = {}
        for indicators in indicator_val:
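pandas can build the contingency table directly with pd.crosstab, which then feeds straight into chi2_contingency (a sketch with made-up categorical series):

```python
import pandas as pd
from scipy.stats import chi2_contingency

# Made-up categorical series standing in for the feature and target vectors
data_series = pd.Series(["a", "b", "a", "b", "a", "a", "b", "b"])
target_series = pd.Series(["x", "x", "y", "y", "x", "y", "x", "y"])

# crosstab builds the contingency (frequency) table without a hand-rolled loop
table = pd.crosstab(data_series, target_series)
chi2, p, dof, expected = chi2_contingency(table)
print(table)
print(chi2, p)
```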

Python Statsmodels: Using SARIMAX with exogenous regressors to get predicted mean and confidence intervals

强颜欢笑 submitted on 2019-12-12 10:34:30
Question: I'm using statsmodels.tsa.SARIMAX() to train a model with exogenous variables. Is there an equivalent of get_prediction() when a model is trained with exogenous variables, so that the returned object contains the predicted mean and confidence interval rather than just an array of predicted mean results? The predict() and forecast() methods take exogenous variables, but only return the predicted mean value.

SARIMA_model = sm.tsa.SARIMAX(endog=y_train.astype('float64'), exog=ExogenousFeature
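A sketch of one way to do this (synthetic data; the question's y_train and ExogenousFeature are stand-ins here): the results object's get_prediction() accepts out-of-sample exog and exposes both the predicted mean and confidence intervals:

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

# Synthetic series driven by one exogenous regressor, for illustration only
rng = np.random.default_rng(0)
exog = pd.DataFrame({"x1": rng.normal(size=120)})
y = 0.5 * exog["x1"] + rng.normal(scale=0.3, size=120)

y_train, X_train = y.iloc[:100], exog.iloc[:100]
X_test = exog.iloc[100:]

res = sm.tsa.SARIMAX(endog=y_train, exog=X_train, order=(1, 0, 0)).fit(disp=False)

# get_prediction() takes out-of-sample exog and returns an object carrying
# both the predicted mean and the confidence intervals
fcast = res.get_prediction(start=100, end=119, exog=X_test)
print(fcast.predicted_mean)
print(fcast.conf_int(alpha=0.05))
```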

python 3.5 in statsmodels ImportError: cannot import name '_representation'

那年仲夏 submitted on 2019-12-12 08:47:21
Question: I cannot manage to import statsmodels.api correctly; when I do, I get this error:

File "/home/mlv/.local/lib/python3.5/site-packages/statsmodels/tsa/statespace/tools.py", line 59, in set_mode
    from . import (_representation, _kalman_filter, _kalman_smoother,
ImportError: cannot import name '_representation'

I have already tried re-installing and updating it, but that does not change anything. Please, I need help =)

Answer 1: Please see the GitHub report for more detail. It turns out that statsmodels is dependent upon