statsmodels

Custom priors in PyMC

北城以北 submitted on 2019-11-30 15:43:12
Say I want to put a custom prior on two variables a and b in PyMC, e.g. p(a,b) ∝ (a+b)^(−5/2) (for the motivation behind this choice of prior, see this answer). Can this be done in PyMC? If so, how? As an example, I would like to define such a prior on a and b in the model below.

import pymc as pm
# ...
# Code that defines the prior: p(a,b) ∝ (a+b)^(-5/2)
# ...
theta = pm.Beta("prior", alpha=a, beta=b)
# Binomials that share a common prior
bins = dict()
for i in xrange(N_cities):
    bins[i] = pm.Binomial('bin_{}'.format(i), p=theta, n=N_trials[i], value=N_yes[i], observed=True)
mcmc = pm.MCMC([bins, ps]
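One way to express this joint prior in PyMC2 (the pm.MCMC API used above) is to give a and b flat priors and add the extra log-density term as a potential. A minimal sketch, assuming PyMC2 and illustrative Uniform bounds:

import pymc as pm
import numpy as np

# Flat priors bound the support; the bounds here are illustrative.
a = pm.Uniform('a', lower=0.01, upper=1e6)
b = pm.Uniform('b', lower=0.01, upper=1e6)

# The potential adds log p(a,b) = -5/2 * log(a+b) to the joint
# log-density, giving p(a,b) ∝ (a+b)^(-5/2) up to normalization.
@pm.potential
def joint_prior(a=a, b=b):
    return -2.5 * np.log(a + b)

theta = pm.Beta("prior", alpha=a, beta=b)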

Equivalent of R's cor.test in Python

[亡魂溺海] submitted on 2019-11-30 13:25:41
Question: Is there a way I can find the r confidence interval in Python? In R I could do something like:

cor.test(m, h)

Pearson's product-moment correlation
data: m and h
t = 0.8974, df = 4, p-value = 0.4202
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
 -0.6022868  0.9164582
sample estimates:
      cor
0.4093729

In Python I can calculate r (cor) using:

r, p = scipy.stats.pearsonr(df.age, df.pets)

But that doesn't return the r confidence interval.

Answer 1: Here's one way
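A common way to reproduce cor.test's interval is to compute r with scipy and build the confidence interval yourself via the Fisher z-transformation, which is what cor.test uses. A sketch; pearsonr_ci is a hypothetical helper, not a scipy function:

import numpy as np
from scipy import stats

def pearsonr_ci(x, y, alpha=0.05):
    # Pearson r plus a (1 - alpha) confidence interval from the
    # Fisher z-transformation, mirroring R's cor.test output.
    r, p = stats.pearsonr(x, y)
    z = np.arctanh(r)                    # Fisher transform of r
    se = 1.0 / np.sqrt(len(x) - 3)       # approximate standard error of z
    zcrit = stats.norm.ppf(1 - alpha / 2)
    lo, hi = np.tanh((z - zcrit * se, z + zcrit * se))
    return r, p, lo, hi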

Python statsmodels ARIMA start [stationarity]

若如初见. submitted on 2019-11-30 13:02:44
I just began working on time series analysis using statsmodels. I have a dataset with dates and values (covering about 3 months). I am facing some issues with providing the right order to the ARIMA model. I am looking to adjust for trend and seasonality and then compute outliers. My 'values' are not stationary, and statsmodels says that I have to either induce stationarity or provide some differencing to make it work. I played around with different orderings (without understanding deeply the consequences of changing p, d and q). When I introduce 1 for differencing, I get this error: ValueError:
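For reference, differencing can be requested directly through the middle term of the ARIMA order rather than transforming the series by hand. A sketch assuming the statsmodels.tsa.arima_model API current at the time (newer releases moved it to statsmodels.tsa.arima.model) and an illustrative order; series stands for the dated values:

from statsmodels.tsa.arima_model import ARIMA  # pre-0.12 statsmodels location

# order=(p, d, q); d=1 differences the series once, which often removes
# a linear trend and with it the stationarity complaint.
model = ARIMA(series, order=(1, 1, 0))
result = model.fit(disp=0)
print(result.summary())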

Difference in Python statsmodels OLS and R's lm

不问归期 submitted on 2019-11-30 12:14:54
Question: I'm not sure why I'm getting slightly different results for a simple OLS, depending on whether I go through pandas' experimental rpy interface to do the regression in R or whether I use statsmodels in Python.

import pandas
from rpy2.robjects import r
from functools import partial

loadcsv = partial(pandas.DataFrame.from_csv, index_col="seqn", parse_dates=False)
demoq = loadcsv("csv/DEMO.csv")
rxq = loadcsv("csv/quest/RXQ_RX.csv")
num_rx = {}
for seqn, num in rxq.rxd295.iteritems():
    try:
        val =
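A frequent cause of small OLS discrepancies with R is the intercept: R's lm() adds one automatically, while statsmodels' array interface does not unless you add a constant. A sketch, with X and y standing in for the design matrix and response:

import statsmodels.api as sm

X = sm.add_constant(X)          # lm() includes an intercept; OLS() won't by itself
model = sm.OLS(y, X).fit()
print(model.summary())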

ImportError: cannot import name 'factorial'

坚强是说给别人听的谎言 submitted on 2019-11-30 11:46:49
I want to use a logit model and am trying to import the statsmodels library.

My version: Python 3.6.8

The best suggestion I have found is to downgrade scipy, but it is unclear how to do that and which version to downgrade to. Please help me resolve this. https://github.com/statsmodels/statsmodels/issues/5747

import statsmodels.formula.api as smf

ImportError Traceback (most recent call last)
<ipython-input-52-f897a2d817de> in <module>
----> 1 import statsmodels.formula.api as smf

~/anaconda3/envs/py36/lib/python3.6/site-packages/statsmodels/formula/api.py in <module>
     13 from statsmodels.robust.robust_linear_model
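Per the linked statsmodels issue, the breakage comes from scipy 1.3.0 removing scipy.misc.factorial, which statsmodels 0.9 and earlier still imported. Either upgrading statsmodels or pinning scipy should resolve it:

pip install -U statsmodels     # 0.10.0+ no longer imports factorial
# or, keeping the old statsmodels:
pip install scipy==1.2.1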

Newey-West standard errors for OLS in Python?

左心房为你撑大大i submitted on 2019-11-30 11:11:01
Question: I want to have a coefficient and the Newey-West standard error associated with it. I am looking for a Python library (ideally, but any working solution is fine) that can do what the following R code is doing:

library(sandwich)
library(lmtest)

a <- matrix(c(1,3,5,7,4,5,6,4,7,8,9))
b <- matrix(c(3,5,6,2,4,6,7,8,7,8,9))

temp.lm = lm(a ~ b)
temp.summ <- summary(temp.lm)
temp.summ$coefficients <- unclass(coeftest(temp.lm, vcov. = NeweyWest))
print(temp.summ$coefficients)

Result:

Estimate Std. Error t
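statsmodels can produce Newey-West (HAC) standard errors directly at fit time. A sketch reproducing the R snippet's data; the maxlags value is illustrative and would have to match sandwich::NeweyWest's lag selection for the numbers to agree exactly:

import numpy as np
import statsmodels.api as sm

a = np.array([1, 3, 5, 7, 4, 5, 6, 4, 7, 8, 9])
b = np.array([3, 5, 6, 2, 4, 6, 7, 8, 7, 8, 9])

X = sm.add_constant(b)
# cov_type='HAC' requests heteroskedasticity-and-autocorrelation-consistent
# (Newey-West) standard errors.
result = sm.OLS(a, X).fit(cov_type='HAC', cov_kwds={'maxlags': 1})
print(result.summary())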

Equivalent of Stata macros in Python

爱⌒轻易说出口 submitted on 2019-11-30 10:55:34
I am trying to use Python for statistical analysis. In Stata I can define local macros and expand them as necessary:

program define reg2
    syntax varlist(min=1 max=1), indepvars(string) results(string)
    if "`results'" == "y" {
        reg `varlist' `indepvars'
    }
    if "`results'" == "n" {
        qui reg `varlist' `indepvars'
    }
end

sysuse auto, clear

So instead of:

reg2 mpg, indepvars("weight foreign price") results("y")

I could do:

local options , indepvars(weight foreign price) results(y)
reg2 mpg `options'

Or even:

local vars weight foreign price
local options , indepvars(`vars') results(y)
reg2 mpg `options'
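The closest Python analogue is bundling the options in a dict and expanding it with ** unpacking, much as the macro expands in place. A toy sketch (reg2 here is a hypothetical stand-in, not a real regression wrapper):

def reg2(depvar, indepvars, results="y"):
    # Build the formula from the option values, as the Stata program
    # builds its command line from the expanded macros.
    formula = "{} ~ {}".format(depvar, " + ".join(indepvars))
    if results == "y":
        print(formula)      # verbose: show what would be estimated
    # results == "n" would run quietly

options = {"indepvars": ["weight", "foreign", "price"], "results": "y"}
reg2("mpg", **options)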

Getting statsmodels to use heteroskedasticity corrected standard errors in coefficient t-tests

匆匆过客 submitted on 2019-11-30 09:03:01
I've been digging into the API of statsmodels.regression.linear_model.RegressionResults and have found how to retrieve different flavors of heteroskedasticity-corrected standard errors (via properties like HC0_se, etc.). However, I can't quite figure out how to get the t-tests on the coefficients to use these corrected standard errors. Is there a way to do this in the API, or do I have to do it manually? If the latter, can you suggest any guidance on how to do this with statsmodels results? The fit method of the linear models, discrete models and GLM takes cov_type and cov_kwds arguments
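In code, the point is that requesting a robust covariance at fit time propagates to every reported statistic, so nothing has to be recomputed by hand. A sketch assuming y and X are already defined:

import statsmodels.api as sm

# Any of 'HC0'..'HC3' can be passed; summary(), bse, tvalues and
# t_test() then all use the corrected covariance automatically.
result = sm.OLS(y, X).fit(cov_type='HC3')
print(result.bse)       # heteroskedasticity-corrected standard errors
print(result.tvalues)   # t statistics computed from those errors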

Decomposing trend, seasonal and residual time series elements

萝らか妹 submitted on 2019-11-30 08:26:37
I have a DataFrame with a few time series:

         divida    movav12       var  varmovav12
Date
2004-01       0        NaN       NaN         NaN
2004-02       0        NaN       NaN         NaN
2004-03       0        NaN       NaN         NaN
2004-04      34        NaN       inf         NaN
2004-05      30        NaN -0.117647         NaN
2004-06      44        NaN  0.466667         NaN
2004-07      35        NaN -0.204545         NaN
2004-08      31        NaN -0.114286         NaN
2004-09      30        NaN -0.032258         NaN
2004-10      24        NaN -0.200000         NaN
2004-11      41        NaN  0.708333         NaN
2004-12      29  24.833333 -0.292683         NaN
2005-01      31  27.416667  0.068966    0.104027
2005-02      28  29.750000 -0.096774    0.085106
2005-03      27  32.000000 -0.035714    0.075630
2005-04      30  31.666667  0.111111   -0.010417
2005-05      31  31.750000  0.033333    0.002632
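statsmodels ships a classical decomposition for exactly this. A sketch assuming df is the DataFrame above with monthly data; the keyword is period in current statsmodels (freq in older releases):

import pandas as pd
from statsmodels.tsa.seasonal import seasonal_decompose

series = df['divida'].astype(float)
series.index = pd.to_datetime(series.index)   # decomposition needs a datetime index
decomposition = seasonal_decompose(series, model='additive', period=12)
decomposition.plot()   # trend, seasonal and residual panels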

python statsmodels - quadratic term in regression

喜你入骨 submitted on 2019-11-30 05:11:54
I have the following linear regression:

import statsmodels.formula.api as sm
model = sm.ols(formula='a ~ b + c', data=data).fit()

I want to add a quadratic term for b to this model. Is there a simple way to do this with statsmodels.ols? Is there a better package I should be using to achieve this? Although the solution by Alexander works, in some situations it is not very convenient. For example, each time you want to predict the outcome of the model for new values, you need to remember to pass both b**2 and b values, which is cumbersome and should not be necessary. Although patsy does
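The patsy route the excerpt is heading toward keeps the transformation inside the formula with I(), so prediction on new data only needs the raw b column. A minimal sketch using the question's own setup (data assumed to be the original DataFrame):

import statsmodels.formula.api as sm

# I(b**2) tells patsy to square b on the fly, at fit and predict time alike.
model = sm.ols(formula='a ~ b + I(b**2) + c', data=data).fit()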