statsmodels

Custom priors in PyMC

北城以北 submitted on 2019-11-30 15:43:12
Say I want to put a custom prior on two variables a and b in PyMC, e.g. p(a,b) ∝ (a+b)^(−5/2) (for the motivation behind this choice of prior, see this answer). Can this be done in PyMC? If so, how? As an example, I would like to define such a prior on a and b in the model below.

import pymc as pm
# ...
# Code that defines the prior: p(a,b) ∝ (a+b)^(-5/2)
# ...
theta = pm.Beta("prior", alpha=a, beta=b)
# Binomials that share a common prior
bins = dict()
for i in xrange(N_cities):
    bins[i] = pm.Binomial('bin_{}'.format(i), p=theta, n=N_trials[i], value=N_yes[i], observed=True)
mcmc = pm.MCMC([bins, ps]
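One way to express this joint prior in PyMC2 (the pm.MCMC API used above) is to give a and b flat priors and add the extra log-density term as a potential. A minimal sketch, assuming PyMC2 and illustrative Uniform bounds:

import pymc as pm
import numpy as np

# Flat priors bound the support; the bounds here are illustrative.
a = pm.Uniform('a', lower=0.01, upper=1e6)
b = pm.Uniform('b', lower=0.01, upper=1e6)

# The potential adds log p(a,b) = -5/2 * log(a+b) to the joint
# log-density, giving p(a,b) ∝ (a+b)^(-5/2) up to normalization.
@pm.potential
def joint_prior(a=a, b=b):
    return -2.5 * np.log(a + b)

theta = pm.Beta("prior", alpha=a, beta=b)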

Equivalent of R's cor.test in Python

[亡魂溺海] submitted on 2019-11-30 13:25:41
Question: Is there a way I can find the r confidence interval in Python? In R I could do something like:

cor.test(m, h)

Pearson's product-moment correlation
data: m and h
t = 0.8974, df = 4, p-value = 0.4202
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
 -0.6022868  0.9164582
sample estimates:
      cor
0.4093729

In Python I can calculate r (cor) using:

r, p = scipy.stats.pearsonr(df.age, df.pets)

But that doesn't return the r confidence interval.

Answer 1: Here's one way
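A common way to reproduce cor.test's interval is to compute r with scipy and build the confidence interval yourself via the Fisher z-transformation, which is what cor.test uses. A sketch; pearsonr_ci is a hypothetical helper, not a scipy function:

import numpy as np
from scipy import stats

def pearsonr_ci(x, y, alpha=0.05):
    # Pearson r plus a (1 - alpha) confidence interval from the
    # Fisher z-transformation, mirroring R's cor.test output.
    r, p = stats.pearsonr(x, y)
    z = np.arctanh(r)                    # Fisher transform of r
    se = 1.0 / np.sqrt(len(x) - 3)       # approximate standard error of z
    zcrit = stats.norm.ppf(1 - alpha / 2)
    lo, hi = np.tanh((z - zcrit * se, z + zcrit * se))
    return r, p, lo, hi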

Python statsmodels ARIMA start [stationarity]

若如初见. submitted on 2019-11-30 13:02:44
I just began working on time series analysis using statsmodels. I have a dataset with dates and values (covering about 3 months). I am facing some issues with providing the right order to the ARIMA model. I am looking to adjust for trend and seasonality and then compute outliers. My 'values' are not stationary, and statsmodels says that I have to either induce stationarity or provide some differencing to make it work. I played around with different orderings (without understanding deeply the consequences of changing p, d and q). When I introduce 1 for differencing, I get this error: ValueError:
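For reference, differencing can be requested directly through the middle term of the ARIMA order rather than transforming the series by hand. A sketch assuming the statsmodels.tsa.arima_model API current at the time (newer releases moved it to statsmodels.tsa.arima.model) and an illustrative order; series stands for the dated values:

from statsmodels.tsa.arima_model import ARIMA  # pre-0.12 statsmodels location

# order=(p, d, q); d=1 differences the series once, which often removes
# a linear trend and with it the stationarity complaint.
model = ARIMA(series, order=(1, 1, 0))
result = model.fit(disp=0)
print(result.summary())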

Difference in Python statsmodels OLS and R's lm

不问归期 submitted on 2019-11-30 12:14:54
Question: I'm not sure why I'm getting slightly different results for a simple OLS, depending on whether I go through pandas' experimental rpy interface to do the regression in R or whether I use statsmodels in Python.

import pandas
from rpy2.robjects import r
from functools import partial

loadcsv = partial(pandas.DataFrame.from_csv, index_col="seqn", parse_dates=False)
demoq = loadcsv("csv/DEMO.csv")
rxq = loadcsv("csv/quest/RXQ_RX.csv")
num_rx = {}
for seqn, num in rxq.rxd295.iteritems():
    try:
        val =
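A frequent cause of small OLS discrepancies with R is the intercept: R's lm() adds one automatically, while statsmodels' array interface does not unless you add a constant. A sketch, with X and y standing in for the design matrix and response:

import statsmodels.api as sm

X = sm.add_constant(X)          # lm() includes an intercept; OLS() won't by itself
model = sm.OLS(y, X).fit()
print(model.summary())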

ImportError: cannot import name 'factorial'

坚强是说给别人听的谎言 submitted on 2019-11-30 11:46:49
I want to use a logit model and am trying to import the statsmodels library.

My version: Python 3.6.8

The best suggestion I have found is to downgrade scipy, but it is unclear how to do that and which version to downgrade to. Please help me resolve this. https://github.com/statsmodels/statsmodels/issues/5747

import statsmodels.formula.api as smf

ImportError Traceback (most recent call last)
<ipython-input-52-f897a2d817de> in <module>
----> 1 import statsmodels.formula.api as smf

~/anaconda3/envs/py36/lib/python3.6/site-packages/statsmodels/formula/api.py in <module>
     13 from statsmodels.robust.robust_linear_model
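Per the linked statsmodels issue, the breakage comes from scipy 1.3.0 removing scipy.misc.factorial, which statsmodels 0.9 and earlier still imported. Either upgrading statsmodels or pinning scipy should resolve it:

pip install -U statsmodels     # 0.10.0+ no longer imports factorial
# or, keeping the old statsmodels:
pip install scipy==1.2.1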

Newey-West standard errors for OLS in Python?

左心房为你撑大大i submitted on 2019-11-30 11:11:01
Question: I want to have a coefficient and the Newey-West standard error associated with it. I am looking for a Python library (ideally, but any working solution is fine) that can do what the following R code is doing:

library(sandwich)
library(lmtest)

a <- matrix(c(1,3,5,7,4,5,6,4,7,8,9))
b <- matrix(c(3,5,6,2,4,6,7,8,7,8,9))

temp.lm = lm(a ~ b)
temp.summ <- summary(temp.lm)
temp.summ$coefficients <- unclass(coeftest(temp.lm, vcov. = NeweyWest))
print(temp.summ$coefficients)

Result:

Estimate Std. Error t
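statsmodels can produce Newey-West (HAC) standard errors directly at fit time. A sketch reproducing the R snippet's data; the maxlags value is illustrative and would have to match sandwich::NeweyWest's lag selection for the numbers to agree exactly:

import numpy as np
import statsmodels.api as sm

a = np.array([1, 3, 5, 7, 4, 5, 6, 4, 7, 8, 9])
b = np.array([3, 5, 6, 2, 4, 6, 7, 8, 7, 8, 9])

X = sm.add_constant(b)
# cov_type='HAC' requests heteroskedasticity-and-autocorrelation-consistent
# (Newey-West) standard errors.
result = sm.OLS(a, X).fit(cov_type='HAC', cov_kwds={'maxlags': 1})
print(result.summary())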

Equivalent of Stata macros in Python

爱⌒轻易说出口 submitted on 2019-11-30 10:55:34
I am trying to use Python for statistical analysis. In Stata I can define local macros and expand them as necessary:

program define reg2
    syntax varlist(min=1 max=1), indepvars(string) results(string)
    if "`results'" == "y" {
        reg `varlist' `indepvars'
    }
    if "`results'" == "n" {
        qui reg `varlist' `indepvars'
    }
end

sysuse auto, clear

So instead of:

reg2 mpg, indepvars("weight foreign price") results("y")

I could do:

local options , indepvars(weight foreign price) results(y)
reg2 mpg `options'

Or even:

local vars weight foreign price
local options , indepvars(`vars') results(y)
reg2 mpg `options'
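The closest Python analogue is bundling the options in a dict and expanding it with ** unpacking, much as the macro expands in place. A toy sketch (reg2 here is a hypothetical stand-in, not a real regression wrapper):

def reg2(depvar, indepvars, results="y"):
    # Build the formula from the option values, as the Stata program
    # builds its command line from the expanded macros.
    formula = "{} ~ {}".format(depvar, " + ".join(indepvars))
    if results == "y":
        print(formula)      # verbose: show what would be estimated
    # results == "n" would run quietly

options = {"indepvars": ["weight", "foreign", "price"], "results": "y"}
reg2("mpg", **options)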

Getting statsmodels to use heteroskedasticity corrected standard errors in coefficient t-tests

匆匆过客 submitted on 2019-11-30 09:03:01
I've been digging into the API of statsmodels.regression.linear_model.RegressionResults and have found how to retrieve different flavors of heteroskedasticity-corrected standard errors (via properties like HC0_se, etc.). However, I can't quite figure out how to get the t-tests on the coefficients to use these corrected standard errors. Is there a way to do this in the API, or do I have to do it manually? If the latter, can you suggest any guidance on how to do this with statsmodels results? The fit method of the linear models, discrete models and GLM takes cov_type and cov_kwds arguments
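In code, the point is that requesting a robust covariance at fit time propagates to every reported statistic, so nothing has to be recomputed by hand. A sketch assuming y and X are already defined:

import statsmodels.api as sm

# Any of 'HC0'..'HC3' can be passed; summary(), bse, tvalues and
# t_test() then all use the corrected covariance automatically.
result = sm.OLS(y, X).fit(cov_type='HC3')
print(result.bse)       # heteroskedasticity-corrected standard errors
print(result.tvalues)   # t statistics computed from those errors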

Decomposing trend, seasonal and residual time series elements

萝らか妹 submitted on 2019-11-30 08:26:37
I have a DataFrame with a few time series:

         divida    movav12       var  varmovav12
Date
2004-01       0        NaN       NaN         NaN
2004-02       0        NaN       NaN         NaN
2004-03       0        NaN       NaN         NaN
2004-04      34        NaN       inf         NaN
2004-05      30        NaN -0.117647         NaN
2004-06      44        NaN  0.466667         NaN
2004-07      35        NaN -0.204545         NaN
2004-08      31        NaN -0.114286         NaN
2004-09      30        NaN -0.032258         NaN
2004-10      24        NaN -0.200000         NaN
2004-11      41        NaN  0.708333         NaN
2004-12      29  24.833333 -0.292683         NaN
2005-01      31  27.416667  0.068966    0.104027
2005-02      28  29.750000 -0.096774    0.085106
2005-03      27  32.000000 -0.035714    0.075630
2005-04      30  31.666667  0.111111   -0.010417
2005-05      31  31.750000  0.033333    0.002632
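statsmodels ships a classical decomposition for exactly this. A sketch assuming df is the DataFrame above with monthly data; the keyword is period in current statsmodels (freq in older releases):

import pandas as pd
from statsmodels.tsa.seasonal import seasonal_decompose

series = df['divida'].astype(float)
series.index = pd.to_datetime(series.index)   # decomposition needs a datetime index
decomposition = seasonal_decompose(series, model='additive', period=12)
decomposition.plot()   # trend, seasonal and residual panels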

python statsmodels - quadratic term in regression

喜你入骨 submitted on 2019-11-30 05:11:54
I have the following linear regression:

import statsmodels.formula.api as sm
model = sm.ols(formula='a ~ b + c', data=data).fit()

I want to add a quadratic term for b to this model. Is there a simple way to do this with statsmodels.ols? Is there a better package I should be using to achieve this? Although the solution by Alexander works, in some situations it is not very convenient. For example, each time you want to predict the outcome of the model for new values, you need to remember to pass both b**2 and b values, which is cumbersome and should not be necessary. Although patsy does
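The patsy route the excerpt is heading toward keeps the transformation inside the formula with I(), so prediction on new data only needs the raw b column. A minimal sketch using the question's own setup (data assumed to be the original DataFrame):

import statsmodels.formula.api as sm

# I(b**2) tells patsy to square b on the fly, at fit and predict time alike.
model = sm.ols(formula='a ~ b + I(b**2) + c', data=data).fit()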