statsmodels | 易学教程

Adding statsmodels 'predict' results to a Pandas dataframe

Question: It is common to want to append the results of predictions to the dataset used to make the predictions, but the statsmodels predict function returns (non-indexed) results of a potentially different length than the dataset on which the predictions are based. For example, if the test dataset, test, contains any null entries, then mod_fit = sm.Logit.from_formula('Y ~ A + B + C', train).fit() press = mod_fit.predict(test) will produce an array that is shorter than the length of test, and cannot be
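One way to handle the length mismatch described in this question is to predict only on the rows that have no nulls and write the results back by index, leaving the other rows as NaN. The sketch below is a minimal illustration under stated assumptions: add_predictions and fake_predict are hypothetical names, and fake_predict merely stands in for a fitted model's predict method so that the example runs without statsmodels.

```python
import numpy as np
import pandas as pd


def add_predictions(df, predict_fn, columns, name="prediction"):
    """Return a copy of df with predictions aligned to the complete rows.

    predict_fn is any callable that takes a DataFrame of complete rows and
    returns one prediction per row (hypothetical stand-in for model.predict).
    """
    complete = df[columns].notna().all(axis=1)  # rows with no nulls in the predictors
    out = df.copy()
    out[name] = np.nan  # rows with nulls keep NaN as their prediction
    out.loc[complete, name] = np.asarray(predict_fn(df.loc[complete, columns]))
    return out


def fake_predict(frame):
    """Hypothetical logistic-style predictor, used only for illustration."""
    return 1.0 / (1.0 + np.exp(-(frame["A"] + frame["B"] + frame["C"])))


test = pd.DataFrame(
    {"A": [0.0, 1.0, np.nan], "B": [0.0, 1.0, 2.0], "C": [0.0, -2.0, 1.0]}
)
result = add_predictions(test, fake_predict, ["A", "B", "C"])
```

The result has the same length and index as test, so the new column can be inspected next to the original data, and rows that could not be predicted are easy to spot.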

How to add sum to zero constraint to GLM in Python?

Question: I have a model set up in Python using the statsmodels glm function, but now I want to add a sum-to-zero constraint to the model. The model is defined as follows: import statsmodels.api as sm import statsmodels.formula.api as smf model = smf.glm(formula="A ~ B + C + D", data=data, family=sm.families.Poisson()).fit() In R, to add the constraint, I would simply do something like this: model <- glm(A ~ B + C + D - 1, family=poisson(), data=data, contrasts=list(C="contr.sum", D="contr.sum")) That adds the sum to zero constraint

Weighted standard deviation in NumPy

Question: numpy.average() has a weights option, but numpy.std() does not. Does anyone have suggestions for a workaround? Answer 1: How about the following short "manual calculation"? def weighted_avg_and_std(values, weights): """ Return the weighted average and standard deviation. values, weights -- Numpy ndarrays with the same shape. """ average = numpy.average(values, weights=weights) # Fast and numerically precise: variance = numpy.average((values-average)**2, weights=weights) return (average, math.sqrt
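The answer's code is cut off in this excerpt. A self-contained sketch of the same idea follows; it is not the answer's exact code, and the function name weighted_mean_and_std is chosen here for illustration. It computes the weighted mean, then the weighted average of squared deviations, which is the biased (population-style) weighted variance with no degrees-of-freedom correction.

```python
import numpy as np


def weighted_mean_and_std(values, weights):
    """Return the weighted mean and the (biased) weighted standard deviation.

    values and weights must have the same shape; weights need not sum to 1.
    """
    values = np.asarray(values, dtype=float)
    weights = np.asarray(weights, dtype=float)
    mean = np.average(values, weights=weights)
    # Weighted average of squared deviations from the weighted mean.
    variance = np.average((values - mean) ** 2, weights=weights)
    return mean, np.sqrt(variance)


mean, std = weighted_mean_and_std([1.0, 2.0, 3.0], [1.0, 1.0, 2.0])
```

With equal weights this reduces to numpy.mean and numpy.std with the default ddof=0.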

Difference in SGD classifier results and statsmodels results for logistic with l1

Question: As a check on my work, I've been comparing the output of scikit-learn's SGDClassifier logistic implementation with statsmodels logistic. Once I add some L1 regularization in combination with categorical variables, I'm getting very different results. Is this a result of different solution techniques, or am I not using the correct parameter? The differences are much bigger on my own dataset, but still pretty large using mtcars: df = sm.datasets.get_rdataset("mtcars", "datasets").data y, X = patsy.dmatrices('am

Comparison of results from statsmodels ARIMA with original data

Question: I have a time series with seasonal components. I fitted the statsmodels ARIMA with, for example, model = tsa.arima_model.ARIMA(data, (8,1,0)).fit() Now, I understand that ARIMA differences my data. How can I compare the results from prediction = model.predict() fig, ax = plt.subplots() data.plot() prediction.plot() given that data will be the original data while prediction is differenced, and so has a mean around 0, different from the mean of data? Answer 1: As the documentation shows, if the keyword typ is

module 'statsmodels.tsa.arima_model' has no arguments 'seasonal', 'xreg', 'xtransf', 'transfer' and 'include.mean'

Question: I'm trying to rebuild an ARIMA model in Python ('statsmodels.tsa.arima_model') that I originally built in R with arima. The problem is that Python has no similar arguments ('seasonal', 'xreg', 'xtransf', 'transfer' and 'include.mean') to make it work as in R. Could anyone show me how? Thanks! Source: https://stackoverflow.com/questions/59046327/module-statsmodels-tsa-arima-model-has-no-arguments-seasonal-xreg-xtran

statsmodels — weights in robust linear regression

Question: I was looking at the robust linear regression in statsmodels and I couldn't find a way to specify the "weights" of this regression, for example assigning a weight to each observation as in weighted least squares regression, similar to what WLS does in statsmodels. Or is there a way to get around it? http://www.statsmodels.org/dev/rlm.html Answer 1: RLM currently does not allow user-specified weights. Weights are used internally to implement the reweighted least squares fitting method. If the weights have the

Python out of sample forecasting ARIMA predict()

Question: Does statsmodels.api.tsa.ARIMA(mylist, (p,d,q)).fit().predict(start, end) only work for d=0? mylist is a list of 72 decimals, all >0, with p=2, d=1, q=1, start=72, end=12, and the majority of the forecasts are negative decimal numbers, which leads me to believe statsmodels doesn't automatically undifference after performing the forecasts. Answer 1: See the typ keyword of predict in the docstring. It determines whether you get predictions in terms of differences or levels. The default is 'linear'
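When forecasts come back as first differences (d=1), they can be mapped back to levels by hand: each level is the last observed value plus the running sum of the forecast differences. The sketch below illustrates only that arithmetic with NumPy; undifference is a hypothetical helper name, not part of statsmodels, and it covers only a single order of differencing.

```python
import numpy as np


def undifference(last_level, diff_forecasts):
    """Convert first-difference forecasts into levels (d=1 only).

    last_level is the last observed value before the forecast window.
    """
    return last_level + np.cumsum(np.asarray(diff_forecasts, dtype=float))


# Round-trip check on a small made-up series: difference it, then rebuild it.
series = np.array([10.0, 12.0, 11.0, 15.0])
diffs = np.diff(series)                   # first differences of the series
rebuilt = undifference(series[0], diffs)  # levels for positions 1..3
```

Negative values in the differenced forecasts therefore mean a predicted decrease from the previous level, not a negative level.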

statsmodels AR model error when calling params

Question: New to statsmodels, trying to use statsmodels.tsa.ar_model to fit a pandas time series. #pull one series from dataframe y=data.sentiment armodel=sm.tsa.ar_model.AR(y, freq='D').fit() armodel.params() gets the following error: C:\Python27\lib\site-packages\pandas\lib.pyd in pandas.lib.SeriesIndex.__set__ (pandas\lib.c:27817)() AssertionError: Index length did not match values Any ideas? Answer 1: You should upgrade to current master, if you can. This was fixed here. Source: https://stackoverflow.com

Statsmodels Python - Weighted GLM

Question: I am currently working with significantly imbalanced data using the statsmodels package's GLM (or the separate logit function if need be). Thus far I have not found a way to implement instance weighting in these methods; however, I heard that the current dev release of 0.7 may have this functionality. 1) Is there a way to implement sample weighting in the current stable release? 2) If not, has the current 0.7-dev release implemented this feature yet? While I know I can manually over/under sample