regression

Repeat regression with varying dependent variable

巧了我就是萌 posted on 2019-11-29 16:03:07
I've searched both Stack Overflow and Google for a solution, but found nothing that solves my problem. I have about 40 dependent variables, for which I aim to obtain adjusted means (lsmeans). I need adjusted means for group A and group B, after accounting for some covariates. My final object should be a data frame with predicted means for all 40 dependent variables for group A and group B. This is what I tried, without any success:

    # Exemplified here with 2 outcome variables
    outcome1 <- c(2, 4, 6, 8, 10, 12, 14, 16)
    outcome2 <- c(1, 2, 3, 4, 5, 6, 7, 8)
    var1 <- c("a", "a", "a", "a", "b", "b", "b", "b")
    var2 <- c
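
One way to approach this kind of task, sketched in base R rather than with the lsmeans package: build each formula with `reformulate()`, fit the model, and predict for each group with the covariate held at its mean. The column `covar` and its values are hypothetical (the question's `var2` is cut off), and predicting at the covariate mean is assumed to be an acceptable stand-in for an adjusted mean in a plain linear model.

```r
# Toy data: outcome1, outcome2 and var1 come from the question;
# `covar` is a hypothetical covariate invented for illustration.
set.seed(1)
dat <- data.frame(
  outcome1 = c(2, 4, 6, 8, 10, 12, 14, 16),
  outcome2 = c(1, 2, 3, 4, 5, 6, 7, 8),
  var1     = factor(c("a", "a", "a", "a", "b", "b", "b", "b")),
  covar    = rnorm(8)
)

outcomes <- c("outcome1", "outcome2")

# One row per group, covariate fixed at its sample mean.
newdat <- data.frame(var1 = factor(c("a", "b")), covar = mean(dat$covar))

# Fit the same right-hand side for every outcome and predict per group.
adj <- sapply(outcomes, function(v) {
  fit <- lm(reformulate(c("var1", "covar"), response = v), data = dat)
  predict(fit, newdata = newdat)
})

# Final object: one row per group, one column per outcome.
adjMeans <- data.frame(var1 = newdat$var1, adj)
adjMeans
```

With 40 outcomes, only the `outcomes` vector would need to change.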

How to update `lm` or `glm` model on same subset of data?

北城以北 posted on 2019-11-29 16:00:45
I am trying to fit two nested models and then test them against each other using the anova function. The commands used are:

    probit <- glm(grad ~ afqt1 + fhgc + mhgc + hisp + black + male, data=dt, family=binomial(link = "probit"))
    nprobit <- update(probit, . ~ . - afqt1)
    anova(nprobit, probit, test="Rao")

However, the variable afqt1 apparently contains NAs, and because the update call does not use the same subset of data, anova() returns the error:

    Error in anova.glmlist(c(list(object), dotargs), dispersion = dispersion, :
      models were not all fitted to the same size of dataset

Is there a simple way
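
A minimal sketch of one possible fix: refit the reduced model on the model frame of the full model, which contains only the rows that survived NA removal. The data below are simulated stand-ins for `dt` with fewer covariates, and the approach assumes the formula uses plain variable names (no transformations such as `log(x)`), since the model frame stores evaluated terms.

```r
# Simulated stand-in for the question's `dt` (hypothetical values).
set.seed(1)
dt <- data.frame(
  grad  = rbinom(200, 1, 0.5),
  afqt1 = rnorm(200),
  fhgc  = rnorm(200),
  male  = rbinom(200, 1, 0.5)
)
dt$afqt1[c(3, 17, 42)] <- NA   # missing values in one regressor

probit <- glm(grad ~ afqt1 + fhgc + male, data = dt,
              family = binomial(link = "probit"))

# model.frame(probit) holds exactly the complete cases used in the fit,
# so the reduced model sees the same rows as the full model.
nprobit <- update(probit, . ~ . - afqt1, data = model.frame(probit))

anova(nprobit, probit, test = "Rao")
```

An equivalent alternative is to subset `dt` by hand, for example with `dt[!is.na(dt$afqt1), ]`, before fitting both models.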

Recoding dummy variable to ordered factor

前提是你 posted on 2019-11-29 14:58:26
I need some help with coding factors for a logistic regression. What I have are six dummy variables representing income brackets. I want to convert these into a single ordered factor for use in a logistic regression. My data frame looks like:

      INC1 INC2 INC3 INC4 INC5 INC6
    1    0    0    1    0    0    0
    2   NA   NA   NA   NA   NA   NA
    3    0    0    0    0    0    1
    4    0    0    0    0    0    1
    5    0    0    1    0    0    0
    6    0    0    0    1    0    0
    7    0    0    1    0    0    0
    8    0    0    0    1    0    0

What I want it to look like:

       INC
    1 INC3
    2   NA
    3 INC6
    4 INC6
    5 INC3
    6 INC4
    7 INC3
    8 INC4

This must be a common (and simple) operation, but my searches have not turned up a concise answer for how to perform this
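
A possible base R sketch of the recoding described above: find, for each row, the column that holds the 1, and use the column names as ordered factor levels. The data frame name `inc` is an assumption; the values are the eight rows shown in the question.

```r
# The eight rows from the question, stored in a hypothetical data frame `inc`.
inc <- data.frame(
  INC1 = c(0, NA, 0, 0, 0, 0, 0, 0),
  INC2 = c(0, NA, 0, 0, 0, 0, 0, 0),
  INC3 = c(1, NA, 0, 0, 1, 0, 1, 0),
  INC4 = c(0, NA, 0, 0, 0, 1, 0, 1),
  INC5 = c(0, NA, 0, 0, 0, 0, 0, 0),
  INC6 = c(0, NA, 1, 1, 0, 0, 0, 0)
)

m <- as.matrix(inc)

# Column index of the 1 in each row; rows containing NA stay NA.
idx <- apply(m, 1, function(r) {
  if (anyNA(r)) NA_integer_ else which(r == 1)[1]
})

# Ordered factor whose levels follow the column order INC1 < ... < INC6.
INC <- factor(colnames(m)[idx], levels = colnames(m), ordered = TRUE)
data.frame(INC)
```

Keeping all six names as levels preserves the bracket ordering even when some brackets (here INC1, INC2, INC5) never occur in the data.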

Fama MacBeth standard errors in R

一曲冷凌霜 posted on 2019-11-29 14:56:55
Question: Does anyone know if there is a package that would run Fama-MacBeth regressions in R and calculate the standard errors? I am aware of the sandwich package and its ability to estimate Newey-West standard errors, as well as providing functions for clustering. However, I have not seen anything with respect to Fama-MacBeth.

Answer 1: The plm package can estimate Fama-MacBeth regressions and SEs.

    require(foreign)
    require(plm)
    require(lmtest)
    test <- read.dta("http://www.kellogg.northwestern.edu/faculty

Obtain standard errors of regression coefficients for an “mlm” object returned by `lm()`

萝らか妹 posted on 2019-11-29 14:50:12
I'd like to run 10 regressions against the same regressor, then pull all the standard errors without using a loop.

    depVars <- as.matrix(data[, 1:10])   # multiple dependent variables
    regressor <- as.matrix(data[, 11])   # independent variable
    allModels <- lm(depVars ~ regressor) # multiple, single variable regressions
    summary(allModels)[1]                # Can "view" the standard error for 1st regression, but can't extract...

allModels is stored as an "mlm" object, which is really tough to work with. It'd be great if I could store a list of lm objects or a matrix with statistics of interest. Again, the objective is
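
A hedged sketch of one way to get at the standard errors: the summary of an "mlm" fit behaves like a list with one `summary.lm` per response, so the coefficient tables can be pulled with `lapply()` and the "Std. Error" column collected with `sapply()`. The data are simulated, with three responses instead of ten, and the names `dat`, `coefTables` and `se` are illustrative.

```r
# Simulated data: three responses and one regressor (hypothetical values).
set.seed(42)
dat <- as.data.frame(matrix(rnorm(50 * 4), ncol = 4))
names(dat) <- c("y1", "y2", "y3", "x")

depVars   <- as.matrix(dat[, 1:3])
regressor <- dat[, 4]
allModels <- lm(depVars ~ regressor)   # object of class "mlm"

# One coefficient table (Estimate, Std. Error, t value, p value) per response.
coefTables <- lapply(summary(allModels), coef)

# Matrix of standard errors: rows are coefficients, columns are responses.
se <- sapply(coefTables, function(tab) tab[, "Std. Error"])
se
```

`lapply()` and `sapply()` still iterate internally, so this avoids an explicit `for` loop rather than iteration as such.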

Python Pandas: how to turn a DataFrame with “factors” into a design matrix for linear regression?

ε祈祈猫儿з posted on 2019-11-29 14:03:38
Question: If memory serves me, in R there is a data type called factor which, when used within a DataFrame, can be automatically unpacked into the necessary columns of a regression design matrix. For example, a factor containing True/False/Maybe values would be transformed into 1 0 0, 0 1 0, or 0 0 1 for the purpose of using lower-level regression code. Is there a way to achieve something similar using the pandas library? I see that there is some regression support within Pandas, but since I have my own

Error in dataframe *tmp* replacement has x data has y

怎甘沉沦 posted on 2019-11-29 13:46:18
I'm a beginner in R. Here is some very simple code where I'm trying to save the residual term:

    # Create variables for child's EA:
    dat$cldeacdi <- rowMeans(dat[,c('cdcresp', 'cdcinv')], na.rm = TRUE)
    dat$cldeacu <- rowMeans(dat[,c('cucresp', 'cucinv')], na.rm = TRUE)
    # Create a residual score for child EA:
    dat$cldearesid <- resid(lm(cldeacu ~ cldeacdi, data = dat))

I'm getting the following message:

    Error in `$<-.data.frame`(`*tmp*`, cldearesid, value = c(-0.18608488908881, :
      replacement has 366 rows, data has 367

I searched for this error but couldn't find anything that could resolve this. Additionally, I've
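
The 366-versus-367 mismatch is consistent with one row being dropped for missing values during fitting, although the excerpt does not confirm that this is the cause. Under that assumption, a minimal sketch of one remedy is `na.action = na.exclude`, which makes `resid()` pad the dropped rows with NA. The data below are simulated and only reuse the question's column names.

```r
# Simulated data with one incomplete row (hypothetical values).
set.seed(7)
dat <- data.frame(cldeacdi = rnorm(10), cldeacu = rnorm(10))
dat$cldeacdi[4] <- NA

# Default behaviour (na.omit): residuals are shorter than the data frame.
length(resid(lm(cldeacu ~ cldeacdi, data = dat)))

# na.exclude: residuals are padded with NA to match the original rows.
fit <- lm(cldeacu ~ cldeacdi, data = dat, na.action = na.exclude)
dat$cldearesid <- resid(fit)
dat
```

Note that `rowMeans(..., na.rm = TRUE)` returns NaN for a row in which every input is missing, and `lm()` treats NaN as missing too.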

Block bootstrap from subject list

一个人想着一个人 posted on 2019-11-29 13:16:30
I'm trying to efficiently implement a block bootstrap technique to get the distribution of regression coefficients. The main outline is as follows. I have a panel data set, and say firm and year are the indices. For each iteration of the bootstrap, I wish to sample n subjects with replacement. From this sample, I need to construct a new data frame that is an rbind() stack of all the observations for each sampled subject, run the regression, and pull out the coefficients. Repeat for a bunch of iterations, say 100. Each firm can potentially be selected multiple times, so I need to include it
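
The outline above can be sketched in base R as follows. Instead of calling `rbind()` on per-firm data frames in every iteration, this version precomputes the row indices of each firm once and stacks the indices, which yields the same resampled data frame. The panel, the model `y ~ x`, and all names (`panel`, `rowsByFirm`, `bootOnce`) are hypothetical.

```r
# Simulated balanced panel: 20 firms observed over 5 years (hypothetical).
set.seed(123)
panel <- expand.grid(firm = 1:20, year = 2001:2005)
panel$x <- rnorm(nrow(panel))
panel$y <- 1 + 2 * panel$x + rnorm(nrow(panel))

# Row indices of each firm, computed once outside the bootstrap loop.
rowsByFirm <- split(seq_len(nrow(panel)), panel$firm)

bootOnce <- function() {
  # Draw firms with replacement; a firm drawn twice contributes its rows twice.
  drawn <- sample(length(rowsByFirm), replace = TRUE)
  idx <- unlist(rowsByFirm[drawn], use.names = FALSE)
  coef(lm(y ~ x, data = panel[idx, ]))
}

# 100 bootstrap replications: one row per replication, one column per coefficient.
bootCoefs <- t(replicate(100, bootOnce()))
apply(bootCoefs, 2, sd)   # bootstrap standard errors
```

Working with indices keeps each iteration to a single subsetting operation, which tends to matter once the panel or the number of replications grows.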

model.matrix(): why do I lose control of contrast in this case

谁说胖子不能爱 posted on 2019-11-29 12:46:31
Suppose we have a toy data frame:

    x <- data.frame(x1 = gl(3, 2, labels = letters[1:3]), x2 = gl(3, 2, labels = LETTERS[1:3]))

I would like to construct a model matrix

    #   x1b x1c x2B x2C
    # 1   0   0   0   0
    # 2   0   0   0   0
    # 3   1   0   1   0
    # 4   1   0   1   0
    # 5   0   1   0   1
    # 6   0   1   0   1

by:

    model.matrix(~ x1 + x2 - 1, data = x, contrasts.arg = list(x1 = contr.treatment(letters[1:3]), x2 = contr.treatment(LETTERS[1:3])))

but actually I get:

    #   x1a x1b x1c x2B x2C
    # 1   1   0   0   0   0
    # 2   1   0   0   0   0
    # 3   0   1   0   1   0
    # 4   0   1   0   1   0
    # 5   0   0   1   0   1
    # 6   0   0   1   0   1
    # attr(,"assign")
    # [1] 1 1 1 2 2
    # attr(,"contrasts")
    # attr(,"contrasts")$x1
    #
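
One workaround that produces the desired matrix, offered as a sketch rather than an explanation of the behaviour asked about: keep the intercept in the formula so that the default treatment contrasts apply to both factors, then drop the intercept column afterwards. The name `mm` is illustrative.

```r
x <- data.frame(x1 = gl(3, 2, labels = letters[1:3]),
                x2 = gl(3, 2, labels = LETTERS[1:3]))

# With an intercept, both factors are coded by treatment contrasts;
# removing the first column afterwards leaves only the dummy columns.
mm <- model.matrix(~ x1 + x2, data = x)[, -1]
mm
```

Subsetting the matrix drops the "assign" and "contrasts" attributes, so they need to be saved beforehand if later code relies on them.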

Rolling regression xts object in R

你说的曾经没有我的故事 posted on 2019-11-29 12:10:01
I am attempting to perform a rolling 100-day regression on an xts object and return the t statistic of the slope coefficient for all dates. I have an xts object, prices:

    > tail(prices)
                 DBC   EEM   EFA    GLD   HYG    IEF   IWM   IYR    MDY    TLT
    2012-11-02 27.14 41.60 53.69 162.60 92.41 107.62 81.19 64.50 179.99 122.26
    2012-11-05 27.37 41.80 53.56 163.23 92.26 107.88 81.73 64.02 181.10 122.95
    2012-11-06 27.86 42.13 54.07 166.30 92.40 107.39 82.34 64.16 182.69 121.79
    2012-11-07 27.34 41.44 53.26 166.49 91.85 108.29 80.34 63.84 178.90 124.00
    2012-11-08 27.38 40.92 52.78 167.99 91.55 108.77 79.21 63.19 176.37 125.84