statsmodels | 易学教程

statsmodels forecasting using ARMA model

Question: I want to forecast time series data. I read in previous posts that the statsmodels module has the tools needed to forecast with the ARMA method, which is exactly what I have been looking for. Even so, I am having trouble forecasting the data. Can someone explain the various parameters used in the model and/or provide a sample example?

Answer 1: The question is very general; for background information, Rob Hyndman's link or any textbook on time series analysis will be useful. Skipper
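To make the role of the autoregressive order concrete, here is a minimal sketch that is deliberately not the statsmodels ARMA estimator: it fits a pure AR(p) model by ordinary least squares with NumPy and rolls the forecast forward one step at a time. The names `fit_ar` and `forecast_ar` are hypothetical helpers, the toy series is invented, and the moving-average part (q) is left out entirely.

```python
import numpy as np

def fit_ar(series, p):
    """Least-squares fit of x_t = c + phi_1*x_{t-1} + ... + phi_p*x_{t-p}."""
    x = np.asarray(series, dtype=float)
    # The row for time t holds [1, x_{t-1}, ..., x_{t-p}].
    rows = [np.concatenate(([1.0], x[t - p:t][::-1])) for t in range(p, len(x))]
    design = np.array(rows)
    target = x[p:]
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return coef[0], coef[1:]

def forecast_ar(series, const, phis, steps):
    """Roll the fitted recursion forward, feeding each forecast back in as a lag."""
    history = list(np.asarray(series, dtype=float))
    p = len(phis)
    out = []
    for _ in range(steps):
        lags = history[-p:][::-1]  # most recent value first
        nxt = const + float(np.dot(phis, lags))
        history.append(nxt)
        out.append(nxt)
    return out

# Toy AR(1) series without noise: x_t = 1 + 0.5 * x_{t-1}, starting at 0.
series = [0.0]
for _ in range(19):
    series.append(1.0 + 0.5 * series[-1])

const, phis = fit_ar(series, p=1)
forecast = forecast_ar(series, const, phis, steps=3)
print(const, phis, forecast)
```

Here p is the number of past values each prediction depends on; a full ARMA(p, q) model would add q lagged error terms and is normally estimated by maximum likelihood rather than by this regression.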

Predicting out future values using OLS regression (Python, StatsModels, Pandas)

I'm currently trying to implement a multiple linear regression (MLR) in Python and am not sure how to apply the coefficients I've found to future values.

    import pandas as pd
    import statsmodels.formula.api as sm
    import statsmodels.api as sm2

    TV = [230.1, 44.5, 17.2, 151.5, 180.8]
    Radio = [37.8,39.3,45.9,41.3,10.8]
    Newspaper = [69.2,45.1,69.3,58.5,58.4]
    Sales = [22.1, 10.4, 9.3, 18.5,12.9]

    df = pd.DataFrame({'TV': TV, 'Radio': Radio, 'Newspaper': Newspaper, 'Sales': Sales})

    Y = df.Sales
    X = df[['TV','Radio','Newspaper']]
    X = sm2.add_constant(X)
    model = sm.OLS(Y, X).fit()

    >>> model.params
    const -0.141990
    TV 0
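What "applying the coefficients to future values" amounts to can be sketched as follows. This is a NumPy-only illustration rather than the statsmodels code from the question: it refits the same five rows by least squares and multiplies a new design matrix by the coefficient vector. The rows in `future` are invented for illustration and are not data from the question.

```python
import numpy as np

tv = [230.1, 44.5, 17.2, 151.5, 180.8]
radio = [37.8, 39.3, 45.9, 41.3, 10.8]
newspaper = [69.2, 45.1, 69.3, 58.5, 58.4]
sales = np.array([22.1, 10.4, 9.3, 18.5, 12.9])

# Design matrix with a leading column of ones for the intercept.
X = np.column_stack([np.ones(len(tv)), tv, radio, newspaper])
beta, *_ = np.linalg.lstsq(X, sales, rcond=None)

# Hypothetical future observations (TV, Radio, Newspaper); values are invented.
future = np.array([[100.0, 30.0, 50.0],
                   [200.0, 20.0, 40.0]])
future_X = np.column_stack([np.ones(len(future)), future])

# Prediction is intercept + sum(coefficient * regressor) for each new row.
predicted_sales = future_X @ beta
print(predicted_sales)
```

With a fitted statsmodels results object, the same step is usually written as `model.predict(new_X)`, where `new_X` has the same columns in the same order as the matrix used for fitting, including the constant.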

Pandas with Fixed Effects

I'm using Pandas on Python 2.7. I have data with the following columns: State, Year, UnempRate, Wage. I'm teaching a course on how to use Python for research. As the culmination of our project, I want to run a regression of UnempRate on Wage, controlling for State and Year fixed effects. I can do this by creating dummies for states and years and then running:

    ols(y=df['UnempRate'],x=df[FullDummyList])

Is there an easier way to do this? I was trying to use the PanelOLS method mentioned here: Fixed effect in Pandas or Statsmodels. But I can't seem to get the syntax right, or find more documentation on
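The dummy-variable approach described in the question can be sketched in a few lines. This is an illustration under stated assumptions, not the PanelOLS route: it uses NumPy only, a synthetic noiseless panel with invented values, and a hypothetical helper `one_hot_drop_first` that drops one category per factor to avoid collinearity with the intercept.

```python
import numpy as np

def one_hot_drop_first(labels):
    """Dummy columns for every category except the first (the baseline)."""
    categories, codes = np.unique(labels, return_inverse=True)
    dummies = np.eye(len(categories))[codes]
    return dummies[:, 1:]

# Synthetic balanced panel: 3 states x 4 years (all values invented).
rng = np.random.default_rng(0)
states = np.repeat(["AL", "CA", "NY"], 4)
years = np.tile([2001, 2002, 2003, 2004], 3)
wage = rng.uniform(10.0, 30.0, size=12)

state_effect = {"AL": 1.0, "CA": -0.5, "NY": 2.0}
year_effect = {2001: 0.0, 2002: 0.3, 2003: 0.6, 2004: -0.2}
unemp = (5.0 - 0.5 * wage
         + np.array([state_effect[s] for s in states])
         + np.array([year_effect[y] for y in years]))

# Columns: intercept, Wage, state dummies, year dummies.
X = np.column_stack([np.ones(12), wage,
                     one_hot_drop_first(states),
                     one_hot_drop_first(years)])
beta, *_ = np.linalg.lstsq(X, unemp, rcond=None)
wage_coef = beta[1]
print(round(wage_coef, 6))
```

In the statsmodels formula interface, the same dummies are commonly requested by wrapping the columns in `C(...)` inside the formula string, for example `C(State)` and `C(Year)`, which avoids building the dummy list by hand.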

Python Statsmodels x13_arima_analysis : AttributeError: 'dict' object has no attribute 'iteritems'

Step 1: My sample data

    import pandas as pd
    from pandas import Timestamp

    s = pd.Series(
        {Timestamp('2013-03-01 00:00:00'): 838.2,
         Timestamp('2013-04-01 00:00:00'): 865.17,
         Timestamp('2013-05-01 00:00:00'): 763.0,
         Timestamp('2013-06-01 00:00:00'): 802.99,
         Timestamp('2013-07-01 00:00:00'): 875.56,
         Timestamp('2013-08-01 00:00:00'): 754.4,
         Timestamp('2013-09-01 00:00:00'): 617.48,
         Timestamp('2013-10-01 00:00:00'): 994.75,
         Timestamp('2013-11-01 00:00:00'): 860.86,
         Timestamp('2013-12-01 00:00:00'): 786.66,
         Timestamp('2014-01-01 00:00:00'): 908.48,
         Timestamp('2014-02-01 00:00:00'): 980.88,
         Timestamp(

Unexpected standard errors with weighted least squares in Python Pandas

In the code for the main OLS class in Python Pandas, I am looking for help to clarify what conventions are used for the standard errors and t-stats reported when weighted OLS is performed. Here's my example data set, with some imports to use Pandas and to use scikits.statsmodels WLS directly:

    import pandas as pd
    import numpy as np
    from statsmodels.regression.linear_model import WLS

    # Make some random data.
    np.random.seed(42)
    df = pd.DataFrame(np.random.randn(10, 3), columns=['a', 'b', 'weights'])

    # Add an intercept term for direct use in WLS
    df['intercept'] = 1

    # Add a number (I picked 10) to

Plotting Historical Cointegration Values between two pairs

Question: Here is a sample ADF test in Python to check for cointegration between two pairs. However, the final result gives only a single numeric value for the cointegration. How can I get the historical cointegration results? Taken from http://www.leinenbock.com/adf-test-in-python/

    import numpy as np
    import statsmodels.api as stat
    import statsmodels.tsa.stattools as ts

    x = np.random.normal(0,1, 1000)
    y = np.random.normal(0,1, 1000)

    def cointegration_test(y, x):
        result = stat.OLS(y, x).fit()
        return ts

Calculate logistic regression in python

Question: I tried to calculate a logistic regression. I have the data as a CSV file. It looks like this:

    node_id,second_major,gender,major_index,year,dorm,high_school,student_fac
    0,0,2,257,2007,111,2849,1
    1,0,2,271,2005,0,51195,2
    2,0,2,269,2007,0,21462,1
    3,269,1,245,2008,111,2597,1
    ..........................

This is my code:

    import pandas as pd
    import statsmodels.api as sm
    import pylab as pl
    import numpy as np

    df = pd.read_csv("Reed98.csv")
    print(df.describe())
    dummy_ranks = pd.get_dummies(df['second_major'],

Correct way to use ARMAResult.predict() function

Question: Following the question "How to get constant term in AR Model with statsmodels and Python?", I'm now trying to use the ARMA model to fit the data, but again I couldn't find a way to interpret the model's result. Here is what I have done, based on "ARMA out-of-sample prediction with statsmodels" and the ARMAResults.predict API document.

    # Parameter
    INPUT_DATA_POINT = 200
    P = 5
    Q = 0

    # Read Data
    data = []
    f = open('stock_all.csv', 'r')
    for line in f:
        data.append(float(line.split(',')[5]))
    f.close()

    #

Holt-Winters time series forecasting with statsmodels

Question: I tried forecasting with the Holt-Winters model as shown below, but I keep getting a prediction that is not consistent with what I expect. I also show a visualization of the plot.

    Train = Airline[:130]
    Test = Airline[129:]

    from statsmodels.tsa.holtwinters import Holt

    y_hat_avg = Test.copy()
    fit1 = Holt(np.asarray(Train['Passengers'])).fit()
    y_hat_avg['Holt_Winter'] = fit1.predict(start=1,end=15)

    plt.figure(figsize=(16,8))
    plt.plot(Train.index, Train['Passengers'], label='Train')
    plt.plot(Test
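For orientation, here is a minimal pure-Python sketch of Holt's linear-trend smoothing with fixed smoothing parameters. It is not the statsmodels implementation, which estimates the parameters; `holt_forecast` is a hypothetical helper and the toy series is invented. It shows that an out-of-sample forecast extends the last smoothed level along the last smoothed trend. In the snippet above, `start=1, end=15` appear to index positions inside the training sample rather than periods after it, which may explain the mismatch; note also that `Airline[:130]` and `Airline[129:]` share one row.

```python
def holt_forecast(values, alpha, beta, steps):
    """Holt's linear-trend smoothing with fixed alpha/beta; returns `steps` forecasts."""
    level = values[0]
    trend = values[1] - values[0]
    for y in values[1:]:
        prev_level = level
        # Blend the new observation with the previous one-step-ahead prediction.
        level = alpha * y + (1 - alpha) * (prev_level + trend)
        # Blend the latest change in level with the previous trend estimate.
        trend = beta * (level - prev_level) + (1 - beta) * trend
    # The h-step-ahead forecast extends the last level along the last trend.
    return [level + h * trend for h in range(1, steps + 1)]

train = [1.0 + 2.0 * t for t in range(10)]  # toy linear series, not the Airline data
forecast = holt_forecast(train, alpha=0.8, beta=0.2, steps=3)
print(forecast)
```

On a perfectly linear series the smoothed level and trend track the data, so the forecasts simply continue the line.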

Plotting Pandas OLS linear regression results

How would I plot the results of this linear regression I ran with pandas?

    import pandas as pd
    from pandas.stats.api import ols

    df = pd.read_csv('Samples.csv', index_col=0)
    control = ols(y=df['Control'], x=df['Day'])
    one = ols(y=df['Sample1'], x=df['Day'])
    two = ols(y=df['Sample2'], x=df['Day'])

I tried plot() but it did not work. I want to plot all three samples on one plot. Is there any pandas or matplotlib code to handle data in the format of these summaries? Anyway, the results look like this:

    Control
    ------------------------Summary of Regression Analysis--------------