statsmodels

calculate coefficient of determination (R2) and root mean square error (RMSE) for non linear curve fitting in python

柔情痞子 submitted on 2019-12-03 17:35:24
How can I calculate the coefficient of determination (R2) and the root mean square error (RMSE) for a nonlinear curve fit in Python? The following code gets as far as the curve fit. How do I then calculate R2 and RMSE?
    import numpy as np
    import matplotlib.pyplot as plt
    from scipy.optimize import curve_fit
    def func(x, a, b, c):
        return a * np.exp(-b * x) + c
    x = np.linspace(0,4,50)
    y = func(x, 2.5, 1.3, 0.5)
    yn = y + 0.2*np.random.normal(size=len(x))
    popt, pcov = curve_fit(func, x, yn)
    plt.figure()
    plt.plot(x, yn, 'ko', label="Original Noised Data")
    plt.plot(x, func(x, *popt), 'r-', label="Fitted Curve")
    plt
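A minimal sketch of one way to get both metrics from the fitted parameters, assuming the same `func` and synthetic data as in the question. The fixed seed and the names `residuals`, `ss_res` and `ss_tot` are illustrative choices, not part of the original post. R2 is computed here as 1 - SS_res / SS_tot; for a nonlinear model this is only a descriptive measure and does not carry all the properties it has in linear regression.

```python
import numpy as np
from scipy.optimize import curve_fit


def func(x, a, b, c):
    return a * np.exp(-b * x) + c


# Synthetic data as in the question, with a fixed seed (an assumption) for reproducibility.
rng = np.random.default_rng(0)
x = np.linspace(0, 4, 50)
yn = func(x, 2.5, 1.3, 0.5) + 0.2 * rng.normal(size=len(x))

popt, pcov = curve_fit(func, x, yn)

# Residuals between the noisy observations and the fitted curve.
residuals = yn - func(x, *popt)

ss_res = np.sum(residuals ** 2)            # residual sum of squares
ss_tot = np.sum((yn - np.mean(yn)) ** 2)   # total sum of squares around the mean
r2 = 1.0 - ss_res / ss_tot
rmse = np.sqrt(np.mean(residuals ** 2))

print(f"R2 = {r2:.3f}, RMSE = {rmse:.3f}")
```

With noise of standard deviation 0.2, an RMSE close to 0.2 suggests the fit has recovered the underlying curve.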

Python pandas has no attribute ols - Error (rolling OLS)

点点圈 submitted on 2019-12-03 17:24:32
For my evaluation, I wanted to run a rolling OLS regression with a window of 1000 observations on the dataset found at this URL: https://drive.google.com/open?id=0B2Iv8dfU4fTUa3dPYW5tejA0bzg using the following Python script.
    # /usr/bin/python -tt
    import numpy as np
    import matplotlib.pyplot as plt
    import pandas as pd
    from statsmodels.formula.api import ols
    df = pd.read_csv('estimated.csv', names=('x','y'))
    model = pd.stats.ols.MovingOLS(y=df.Y, x=df[['y']], window_type='rolling', window=1000, intercept=True)
    df['Y_hat'] = model.y_predict
However, when I run my Python script, I am getting this error:

Using statsmodel estimations with scikit-learn cross validation, is it possible?

≡放荡痞女 submitted on 2019-12-03 16:32:34
Question: I posted this question to the Cross Validated forum and later realized that it might find a more appropriate audience on Stack Overflow instead. I am looking for a way to feed the fit object (result) obtained from Python statsmodels into cross_val_score of the scikit-learn cross_validation module. The attached link suggests that it may be possible, but I have not succeeded. I am getting the following error: estimator should be an estimator implementing 'fit' method statsmodels.discrete.discrete

Statsmodels mosaic plot ValueError: cannot convert float NaN to integer

寵の児 submitted on 2019-12-03 16:00:39
I have a simple pandas DataFrame, for which I would like to create a mosaic plot. Here is my code:
    import pandas as pd
    from statsmodels.graphics.mosaicplot import mosaic
    mydata = pd.DataFrame({
        'id2': {64: 'Angelica', 65: 'DXW_UID', 66: 'casuid01', 67: 'casuid01',
                68: 'EC93_uid', 69: 'EC93_uid', 70: 'EC93_uid', 60: 'DXW_UID',
                61: 'AtmosFox', 62: 'DXW_UID', 63: 'DXW_UID'},
        'id1': {64: 'TGP', 65: 'Retention01', 66: 'default', 67: 'default',
                68: 'Musa_EC_9_3', 69: 'Musa_EC_9_3', 70: 'Musa_EC_9_3', 60: 'default',
                61: 'default', 62: 'default', 63: 'default'}})
    mydata
        id1      id2
    60  default  DXW_UID
    61

Difference between the interaction : and * term for formulas in StatsModels OLS regression

落花浮王杯 submitted on 2019-12-03 15:21:25
Hi, I'm learning statsmodels and can't figure out the difference between : and * (interaction terms) for formulas in statsmodels OLS regression. Could you please give me a hint to figure this out? Thank you! The documentation: http://statsmodels.sourceforge.net/devel/example_formulas.html
Yaron
":" gives a regression with only the interaction term, without the individual variables themselves. "*" gives a regression with the individual variables plus their interaction. For example:
a. GLMmodel = glm("y ~ a: b" , data = df): you'll have only one independent variable, which is the result of "a"
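The difference can be checked by looking at which coefficients each formula produces. The sketch below uses `ols` rather than `glm`, with synthetic numeric columns `a`, `b` and `y` that are assumptions made for illustration only.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic data (assumption): y depends on a, b and their product.
rng = np.random.default_rng(0)
df = pd.DataFrame({"a": rng.normal(size=200), "b": rng.normal(size=200)})
df["y"] = (1.0 + 0.5 * df["a"] - 0.3 * df["b"]
           + 2.0 * df["a"] * df["b"] + 0.1 * rng.normal(size=200))

only_interaction = smf.ols("y ~ a:b", data=df).fit()   # interaction term only
full = smf.ols("y ~ a*b", data=df).fit()               # main effects + interaction
expanded = smf.ols("y ~ a + b + a:b", data=df).fit()   # the same model written out

print(list(only_interaction.params.index))  # ['Intercept', 'a:b']
print(list(full.params.index))              # ['Intercept', 'a', 'b', 'a:b']
```

In other words, `a*b` is shorthand for `a + b + a:b`, so `full` and `expanded` give the same estimates.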

VAR model with pandas + statsmodels in Python

不羁的心 submitted on 2019-12-03 15:14:24
I am an avid user of R, but recently switched to Python for a few different reasons. However, I am struggling a little to run the vector AR model in Python from statsmodels. Q#1. I get an error when I run this, and I have a suspicion it has something to do with the type of my vector.
    import numpy as np
    import statsmodels.tsa.api
    from statsmodels import datasets
    import datetime as dt
    import pandas as pd
    from pandas import Series
    from pandas import DataFrame
    import os
    df = pd.read_csv('myfile.csv')
    speedonly = DataFrame(df['speed'])
    results = statsmodels.tsa.api.VAR(speedonly)
Traceback (most

Fama-MacBeth Regression in Python (Pandas or Statsmodels)

妖精的绣舞 submitted on 2019-12-03 15:10:49
Econometric Background
Fama-MacBeth regression refers to a procedure for running regressions on panel data (where there are N different individuals and each individual is observed over multiple periods T, e.g. days, months, years). So in total there are N x T observations. Note that it's OK if the panel is not balanced. The Fama-MacBeth procedure first runs a cross-sectional regression for each period, i.e. it pools the N individuals together in a given period t, and does this for t=1,...,T. So in total T regressions are run. This gives a time series of coefficients for each independent variable. Then we can
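The per-period regressions described above can be sketched with pandas and NumPy alone. This is a minimal illustration under stated assumptions: the function name `fama_macbeth`, the column names `period`, `x` and `y`, and the synthetic panel are all hypothetical. For the second step it uses the usual Fama-MacBeth summary, where each coefficient is averaged over time and its standard error is taken from the time-series variation of the estimates (with no autocorrelation adjustment).

```python
import numpy as np
import pandas as pd


def fama_macbeth(panel, y_col, x_cols, time_col):
    """Run one cross-sectional OLS per period, then summarize the coefficients."""
    names = ["const"] + list(x_cols)
    periods, betas = [], []
    for period, group in panel.groupby(time_col):
        # Design matrix with an intercept for this period's cross-section.
        X = np.column_stack([np.ones(len(group)), group[x_cols].to_numpy()])
        beta, *_ = np.linalg.lstsq(X, group[y_col].to_numpy(), rcond=None)
        periods.append(period)
        betas.append(beta)
    betas = pd.DataFrame(np.vstack(betas), index=periods, columns=names)

    n_periods = len(betas)
    mean = betas.mean()
    std_err = betas.std(ddof=1) / np.sqrt(n_periods)
    summary = pd.DataFrame({"mean": mean, "std_err": std_err, "t_stat": mean / std_err})
    return betas, summary


# Synthetic unbalanced panel (assumption): 60 periods, up to 100 individuals each.
rng = np.random.default_rng(0)
panel = pd.DataFrame({
    "period": np.repeat(np.arange(60), 100),
    "x": rng.normal(size=6000),
})
panel["y"] = 0.5 + 1.5 * panel["x"] + rng.normal(size=6000)
panel = panel[rng.random(len(panel)) > 0.1]  # drop ~10% of rows to unbalance it

betas, summary = fama_macbeth(panel, "y", ["x"], "period")
print(summary)
```

Because each period is estimated separately, an unbalanced panel needs no special handling as long as every period has enough observations for the regression.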

Confidence interval for LOWESS in Python

喜欢而已 submitted on 2019-12-03 14:46:40
Question: How would I calculate the confidence intervals for a LOWESS regression in Python? I would like to add these as a shaded region to the LOESS plot created with the following code (other packages than statsmodels are fine as well).
    import numpy as np
    import pylab as plt
    import statsmodels.api as sm
    x = np.linspace(0,2*np.pi,100)
    y = np.sin(x) + np.random.random(100) * 0.2
    lowess = sm.nonparametric.lowess(y, x, frac=0.1)
    plt.plot(x, y, '+')
    plt.plot(lowess[:, 0], lowess[:, 1])
    plt.show()
I've

Forecasting with statsmodels

痞子三分冷 submitted on 2019-12-03 13:18:30
I have a .csv file containing a 5-year time series with hourly resolution (a commodity price). Based on the historical data, I want to create a forecast of the prices for the 6th year. I have read a couple of articles on the web about this type of procedure, and I basically based my code on the code posted there, since my knowledge of both Python (especially statsmodels) and statistics is limited at best. Here are the links, for those who are interested: http://www.seanabu.com/2016/03/22/time-series-seasonal-ARIMA-model-in-python/ http://www.johnwittenauer.net/a-simple-time-series-analysis

Using statsmodel estimations with scikit-learn cross validation, is it possible?

亡梦爱人 submitted on 2019-12-03 12:22:01
I posted this question to the Cross Validated forum and later realized that it might find a more appropriate audience on Stack Overflow instead. I am looking for a way to feed the fit object (result) obtained from Python statsmodels into cross_val_score of the scikit-learn cross_validation module. The attached link suggests that it may be possible, but I have not succeeded. I am getting the following error: estimator should be an estimator implementing 'fit' method statsmodels.discrete.discrete_model.BinaryResultsWrapper object at 0x7fa6e801c590 was passed Refer to this link Indeed, you cannot use