statsmodels

statsmodels forecasting using ARMA model

∥☆過路亽.° submitted on 2019-12-04 18:09:22
Question: I want to forecast time series data. I read in previous posts that the statsmodels module has the tools needed to forecast with the ARMA method, which is exactly what I have been looking for. Even so, I am having trouble forecasting the data. Can someone explain the various parameters used in the model and/or provide a sample example? Answer 1: The question is very general. For background information, Rob Hyndman's link or any textbook on time series analysis will be useful. Skipper

Predicting out future values using OLS regression (Python, StatsModels, Pandas)

天涯浪子 submitted on 2019-12-04 16:40:48
I'm currently trying to implement a multiple linear regression (MLR) in Python and am not sure how to apply the coefficients I've found to future values.

    import pandas as pd
    import statsmodels.formula.api as sm
    import statsmodels.api as sm2
    TV = [230.1, 44.5, 17.2, 151.5, 180.8]
    Radio = [37.8, 39.3, 45.9, 41.3, 10.8]
    Newspaper = [69.2, 45.1, 69.3, 58.5, 58.4]
    Sales = [22.1, 10.4, 9.3, 18.5, 12.9]
    df = pd.DataFrame({'TV': TV, 'Radio': Radio, 'Newspaper': Newspaper, 'Sales': Sales})
    Y = df.Sales
    X = df[['TV', 'Radio', 'Newspaper']]
    X = sm2.add_constant(X)
    # The array-based OLS class lives in statsmodels.api (sm2 here);
    # current releases of the formula API expose only the lowercase ols.
    model = sm2.OLS(Y, X).fit()
    >>> model.params
    const -0.141990
    TV 0

Pandas with Fixed Effects

假如想象 submitted on 2019-12-04 15:48:04
I'm using Pandas on Python 2.7. I have data with the following columns: State, Year, UnempRate, Wage. I'm teaching a course on how to use Python for research. As the culmination of our project, I want to run a regression of UnempRate on Wage, controlling for State and Year fixed effects. I can do this by creating dummies for states and years and then calling ols(y=df['UnempRate'], x=df[FullDummyList]). Is there an easier way to do this? I was trying to use the PanelOLS method mentioned here: Fixed effect in Pandas or Statsmodels. But I can't seem to get the syntax right, or find more documentation on
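A possible shortcut, sketched here with the statsmodels formula API rather than the `PanelOLS` route the question mentions, is to wrap the grouping columns in `C(...)` so the dummies are created automatically. The column names follow the question; the data itself is randomly generated for illustration.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Made-up panel: 3 states x 4 years.
rng = np.random.default_rng(0)
states = ["CA", "NY", "TX"]
years = [2001, 2002, 2003, 2004]
df = pd.DataFrame([(s, y) for s in states for y in years],
                  columns=["State", "Year"])
df["Wage"] = rng.normal(20.0, 2.0, size=len(df))
df["UnempRate"] = 8.0 - 0.1 * df["Wage"] + rng.normal(0.0, 0.5, size=len(df))

# C(...) marks a column as categorical, so the formula parser expands it
# into dummy variables and drops one reference level per factor.
fit = smf.ols("UnempRate ~ Wage + C(State) + C(Year)", data=df).fit()
wage_coef = fit.params["Wage"]
print(fit.params)
```

The estimate of interest is `fit.params["Wage"]`; the remaining parameters are the intercept and the state and year dummies.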

Python Statsmodels x13_arima_analysis : AttributeError: 'dict' object has no attribute 'iteritems'

柔情痞子 submitted on 2019-12-04 15:28:27
Step 1: My sample data

    import pandas as pd
    from pandas import Timestamp
    s = pd.Series(
        {Timestamp('2013-03-01 00:00:00'): 838.2,
         Timestamp('2013-04-01 00:00:00'): 865.17,
         Timestamp('2013-05-01 00:00:00'): 763.0,
         Timestamp('2013-06-01 00:00:00'): 802.99,
         Timestamp('2013-07-01 00:00:00'): 875.56,
         Timestamp('2013-08-01 00:00:00'): 754.4,
         Timestamp('2013-09-01 00:00:00'): 617.48,
         Timestamp('2013-10-01 00:00:00'): 994.75,
         Timestamp('2013-11-01 00:00:00'): 860.86,
         Timestamp('2013-12-01 00:00:00'): 786.66,
         Timestamp('2014-01-01 00:00:00'): 908.48,
         Timestamp('2014-02-01 00:00:00'): 980.88,
         Timestamp(

Unexpected standard errors with weighted least squares in Python Pandas

笑着哭i submitted on 2019-12-04 15:20:03
In the code for the main OLS class in Python Pandas, I am looking for help to clarify which conventions are used for the standard errors and t-stats reported when weighted OLS is performed. Here's my example data set, with some imports to use Pandas and to use scikits.statsmodels WLS directly:

    import pandas as pd
    import numpy as np
    from statsmodels.regression.linear_model import WLS
    # Make some random data.
    np.random.seed(42)
    df = pd.DataFrame(np.random.randn(10, 3), columns=['a', 'b', 'weights'])
    # Add an intercept term for direct use in WLS
    df['intercept'] = 1
    # Add a number (I picked 10) to

Plotting Historical Cointegration Values between two pairs

倾然丶 夕夏残阳落幕 submitted on 2019-12-04 14:48:02
Question: Here is a sample ADF test in Python to check for cointegration between two pairs. However, the final result gives only a single numeric value for cointegration. How can I get the historical cointegration results? Taken from http://www.leinenbock.com/adf-test-in-python/

    import numpy as np
    import statsmodels.api as stat
    import statsmodels.tsa.stattools as ts
    x = np.random.normal(0, 1, 1000)
    y = np.random.normal(0, 1, 1000)
    def cointegration_test(y, x):
        result = stat.OLS(y, x).fit()
        return ts

Calculate logistic regression in python

一曲冷凌霜 submitted on 2019-12-04 12:01:31
Question: I tried to calculate a logistic regression. I have the data as a CSV file. It looks like this:

    node_id,second_major,gender,major_index,year,dorm,high_school,student_fac
    0,0,2,257,2007,111,2849,1
    1,0,2,271,2005,0,51195,2
    2,0,2,269,2007,0,21462,1
    3,269,1,245,2008,111,2597,1
    ..........................

This is my code:

    import pandas as pd
    import statsmodels.api as sm
    import pylab as pl
    import numpy as np
    df = pd.read_csv("Reed98.csv")
    print(df.describe())
    dummy_ranks = pd.get_dummies(df['second_major'],

Correct way to use ARMAResult.predict() function

跟風遠走 submitted on 2019-12-04 11:07:57
Question: Following this question, How to get constant term in AR Model with statsmodels and Python?, I'm now trying to use the ARMA model to fit the data, but again I couldn't find a way to interpret the model's result. Here is what I have done, based on ARMA out-of-sample prediction with statsmodels and the ARMAResults.predict API documentation.

    # Parameter
    INPUT_DATA_POINT = 200
    P = 5
    Q = 0
    # Read Data
    data = []
    f = open('stock_all.csv', 'r')
    for line in f:
        data.append(float(line.split(',')[5]))
    f.close()
    #

Holt-Winters time series forecasting with statsmodels

半腔热情 submitted on 2019-12-04 11:05:46
Question: I tried forecasting with the Holt-Winters model as shown below, but I keep getting a prediction that is not consistent with what I expect. I also show a visualization of the plot.

    import numpy as np
    import matplotlib.pyplot as plt
    from statsmodels.tsa.holtwinters import Holt
    Train = Airline[:130]
    Test = Airline[129:]
    y_hat_avg = Test.copy()
    fit1 = Holt(np.asarray(Train['Passengers'])).fit()
    y_hat_avg['Holt_Winter'] = fit1.predict(start=1,end=15)
    plt.figure(figsize=(16,8))
    plt.plot(Train.index, Train['Passengers'], label='Train')
    plt.plot(Test

Plotting Pandas OLS linear regression results

南楼画角 submitted on 2019-12-04 08:48:27
How would I plot the results of this linear regression I did with pandas?

    import pandas as pd
    from pandas.stats.api import ols
    df = pd.read_csv('Samples.csv', index_col=0)
    control = ols(y=df['Control'], x=df['Day'])
    one = ols(y=df['Sample1'], x=df['Day'])
    two = ols(y=df['Sample2'], x=df['Day'])

I tried plot() but it did not work. I want to plot all three samples on one plot. Is there any pandas or matplotlib code to handle data in the format of these summaries? Anyway, the results look like this:

    Control ------------------------Summary of Regression Analysis--------------