glm

How to repeat a process N times?

北城以北 submitted on 2020-08-24 03:43:05
Question: I have:

    x = rnorm(100)
    # Part b
    z = rbinom(100,1,0.60)
    # Part c
    y = 1.4 + 0.7*x - 0.5*z
    # Part d
    x1 = abs(x)
    y1 = abs(y)
    Don<-cbind(y1,x1,z)
    Don1 <- data.frame(Don)
    Reg <- glm(y1~x1+z,family=poisson(link="log"),Don1)
    # Part e
    # Bias of beta
    Reg.cf <- coef(Reg)
    biais0 = Reg.cf[1] - 1.4
    biais1 = Reg.cf[2] - 0.7
    biais2 = Reg.cf[3] + 0.5

I need to repeat all of this 100 times to get a different set of coefficients each time, calculate the bias for each run, and then write the mean of each bias to a text file.
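One possible way to do the repetition, sketched here under a few assumptions: the steps are wrapped in a helper called one_run() (a hypothetical name), replicate() runs it 100 times, and the means are written to a file named mean_bias.txt in the temporary directory (also hypothetical). The seed and the suppressWarnings() call are additions that are not in the question; the Poisson family warns about non-integer responses, and this sketch simply silences that.

```r
set.seed(1)  # assumption: fixed seed so the run is reproducible

# One simulation run: returns the three biases as a named vector.
one_run <- function(n = 100) {
  x <- rnorm(n)
  z <- rbinom(n, 1, 0.60)
  y <- 1.4 + 0.7 * x - 0.5 * z
  Don1 <- data.frame(y1 = abs(y), x1 = abs(x), z = z)
  # poisson() warns about non-integer responses; silenced in this sketch.
  Reg <- suppressWarnings(
    glm(y1 ~ x1 + z, family = poisson(link = "log"), data = Don1)
  )
  Reg.cf <- coef(Reg)
  c(biais0 = unname(Reg.cf[1]) - 1.4,
    biais1 = unname(Reg.cf[2]) - 0.7,
    biais2 = unname(Reg.cf[3]) + 0.5)
}

# replicate() collects the runs into a 3 x 100 matrix, one column per run.
biases <- replicate(100, one_run())
mean_biases <- rowMeans(biases)

# Hypothetical output location: one header row, one row of means.
out_file <- file.path(tempdir(), "mean_bias.txt")
write.table(t(mean_biases), file = out_file, row.names = FALSE, quote = FALSE)
```

Changing the first argument of replicate() changes the number of repetitions; the rest of the code does not depend on it.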

Using geom_smooth for fitting a glm to fractions

大憨熊 submitted on 2020-08-05 08:00:07
Question: This post is somewhat related to this post. Here I have xy grouped data where the y values are fractions:

    library(dplyr)
    library(ggplot2)
    library(ggpmisc)
    set.seed(1)
    df1 <- data.frame(value = c(0.8,0.5,0.4,0.2,0.5,0.6,0.5,0.48,0.52),
                      age = rep(c("d2","d4","d45"),3),
                      group = c("A","A","A","B","B","B","C","C","C")) %>%
      dplyr::mutate(time = as.integer(age)) %>%
      dplyr::arrange(group,time) %>%
      dplyr::mutate(group_age=paste0(group,"_",age))
    df1$group_age <- factor(df1$group_age,levels=unique(df1$group_age))

Can multinomial models be estimated using Generalized Linear model?

て烟熏妆下的殇ゞ submitted on 2020-05-24 18:15:10
Question: In the analysis of categorical data, we often use logistic regression to estimate relationships between binomial outcomes and one or more covariates. I understand this is a type of generalized linear model (GLM). In R, this is implemented with the glm function using the argument family=binomial. Categorical data analysis also includes multinomial models. Are these not GLMs? And can't they be estimated in R using the glm function? (In this post for Multinomial Logistic Regression. The

pass family= to step() via glm() programmatically

让人想犯罪 __ submitted on 2020-03-22 06:44:32
Question: I am trying to demonstrate via simulation the performance of different models and feature selection techniques, so I wish to pass various arguments to glm() programmatically. Under ?glm we read (italics mine):

    family: a description of the error distribution and link function to be used in the model. For glm this can be a character string naming a family function, a family function or the result of a call to a family function. For glm.fit only the third option is supported. (See family for
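As a small illustration of the quoted help text only (it does not involve step()), the three accepted forms of family can be compared on toy data. The data frame d and the vector family_names are hypothetical, made up for this sketch.

```r
set.seed(42)
# Hypothetical toy data, not from the question.
d <- data.frame(x = rnorm(50))
d$y <- rbinom(50, 1, plogis(0.5 * d$x))

fit_chr  <- glm(y ~ x, data = d, family = "binomial")  # character string
fit_fun  <- glm(y ~ x, data = d, family = binomial)    # family function
fit_call <- glm(y ~ x, data = d, family = binomial())  # result of a call

# The three forms should describe the same model.
same_coefs <- isTRUE(all.equal(coef(fit_chr), coef(fit_fun))) &&
  isTRUE(all.equal(coef(fit_fun), coef(fit_call)))

# Choosing the family by name, e.g. inside a simulation loop.
family_names <- c("binomial", "quasibinomial")  # hypothetical choices
fits <- lapply(family_names, function(f) glm(y ~ x, data = d, family = f))
names(fits) <- family_names
```

Note that the call stored in each fitted object still contains the symbol f rather than the family name, which may matter for functions that re-evaluate that call later.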

Dummy Variables in Julia

五迷三道 submitted on 2020-01-14 07:58:13
Question: In R there is nice functionality for running a regression with dummy variables for each level of a categorical variable, e.g. "Automatically expanding an R factor into a collection of 1/0 indicator variables for every factor level". Is there an equivalent way to do this in Julia?

    x = randn(1000)
    group = repmat(1:25 , 40)
    groupMeans = randn(25)
    y = 3*x + groupMeans[group]
    data = DataFrame(x=x, y=y, g=group)
    for i in levels(group)
        data[parse("I$i")] = data[:g] .== i
    end
    lm(y~x+I1+I2+I3+I4+I5+I6+I7

Cross validation for glm() models

故事扮演 submitted on 2020-01-11 15:33:50
Question: I'm trying to do a 10-fold cross validation for some glm models that I built earlier in R. I'm a little confused about the cv.glm() function in the boot package, although I've read a lot of help files. When I provide the following call:

    library(boot)
    cv.glm(data, glmfit, K=10)

does the "data" argument here refer to the whole dataset or only to the test set? The examples I have seen so far provide the "data" argument as the test set but that did not really make sense, such as why do 10
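For what it's worth, the boot help page describes data as the matrix or data frame containing the data, and cv.glm() does the splitting into K groups itself, so the usual pattern is to pass the full dataset the model was fitted on. A minimal sketch on made-up data (dat is a hypothetical stand-in for the asker's dataset):

```r
library(boot)

set.seed(123)
# Hypothetical toy data standing in for the asker's dataset.
dat <- data.frame(x = rnorm(200))
dat$y <- rbinom(200, 1, plogis(-0.3 + 0.8 * dat$x))

# Model fitted once on the whole dataset.
glmfit <- glm(y ~ x, data = dat, family = binomial)

# Pass the same full data frame; cv.glm() builds the 10 folds and refits
# the model on each training part internally.
cv <- cv.glm(dat, glmfit, K = 10)

# delta holds the raw and the adjusted cross-validation estimates
# of prediction error (default cost: average squared error).
cv$delta
```

With no cost argument the error is measured as average squared error; a different cost function can be supplied as the third argument.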

Can one extract model fit parameters after a ggplot stat_smooth call?

北城余情 submitted on 2020-01-11 06:33:10
Question: Using stat_smooth, I can fit models to data. E.g.

    g=ggplot(tips,aes(x=tip,y=as.numeric(unclass(factor(tips$sex))-1))) +facet_grid(time~.)
    g=g+ stat_summary(fun.y=mean,geom="point")
    g=g+ stat_smooth(method="glm", family="binomial")

I would like to know the coefficients of the glm binomial fits. I could re-do the fit with dlply and get the coefficients with ldply, but I'd like to avoid such duplication. Calling str(g) reveals the hierarchy of objects that ggplot2 creates, perhaps there's some

How to debug “contrasts can be applied only to factors with 2 or more levels” error?

偶尔善良 submitted on 2020-01-07 08:02:07
Question: Here are all the variables I'm working with:

    str(ad.train)
    $ Date   : Factor w/ 427 levels "2012-03-24","2012-03-29",..: 4 7 12 14 19 21 24 29 31 34 ...
    $ Team   : Factor w/ 18 levels "Adelaide","Brisbane Lions",..: 1 1 1 1 1 1 1 1 1 1 ...
    $ Season : int 2012 2012 2012 2012 2012 2012 2012 2012 2012 2012 ...
    $ Round  : Factor w/ 28 levels "EF","GF","PF",..: 5 16 21 22 23 24 25 26 27 6 ...
    $ Score  : int 137 82 84 96 110 99 122 124 49 111 ...
    $ Margin : int 69 18 -56 46 19 5 50 69 -26 29 ...
    $
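A common first check for this error is to count how many distinct levels each categorical column actually contains, since a factor with fewer than two observed levels cannot be given contrasts. The sketch below uses a made-up data frame (toy) and a hypothetical helper, observed_levels(); it looks at each column on its own, so rows dropped later because of missing values in other columns could reduce the counts further.

```r
# Hypothetical toy data: `Team` has only one observed level.
toy <- data.frame(
  Score = c(137, 82, 84, 96),
  Team  = factor(c("Adelaide", "Adelaide", "Adelaide", "Adelaide")),
  Round = factor(c("R1", "R2", "R3", "R4"))
)

# Count the distinct non-missing values in each factor or character column.
observed_levels <- function(df) {
  is_cat <- vapply(df, function(col) is.factor(col) || is.character(col),
                   logical(1))
  vapply(df[is_cat], function(col) length(unique(col[!is.na(col)])),
         integer(1))
}

lv <- observed_levels(toy)

# Columns with fewer than two observed levels are the likely culprits.
problem_cols <- names(lv)[lv < 2]
problem_cols
```

Running the same helper on the subset of rows that survives missing-value removal gives a closer picture of what the model fit actually sees.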

Using stargazer with memory greedy glm objects

坚强是说给别人听的谎言 submitted on 2020-01-06 03:07:05
Question: I'm trying to run the following regressions:

    m1=glm(y~x1+x2+x3+x4,data=df,family=binomial())
    m2=glm(y~x1+x2+x3+x4+x5,data=df,family=binomial())
    m3=glm(y~x1+x2+x3+x4+x5+x6,data=df,family=binomial())
    m4=glm(y~x1+x2+x3+x4+x5+x6+x7,data=df,family=binomial())

and then to print them using the stargazer package:

    stargazer(m1,m2,m3,m4, type="html", out="models.html")

The problem is that the data frame df is rather big (~600MB), so each glm object I create is at least ~1.5GB. This creates a memory issue
