glm | 易学教程

How to repeat a process N times?

Question: I have:

    x = rnorm(100)
    # Part b
    z = rbinom(100,1,0.60)
    # Part c
    y = 1.4 + 0.7*x - 0.5*z
    # Part d
    x1 = abs(x)
    y1 = abs(y)
    Don<-cbind(y1,x1,z)
    Don1 <- data.frame(Don)
    Reg <- glm(y1~x1+z,family=poisson(link="log"),Don1)
    # Part e
    # Bias of beta
    Reg.cf <- coef(Reg)
    biais0 = Reg.cf[1] - 1.4
    biais1 = Reg.cf[2] - 0.7
    biais2 = Reg.cf[3] + 0.5

I need to repeat all of this 100 times so that I get different coefficients on each run, calculate the bias each time, and then write the mean of each bias to a text file.
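One way to approach the repetition is to wrap a single run in a function and call it with `replicate()`. The following is only a sketch under stated assumptions: the function name `simulate_bias` and the output file name are illustrative and do not come from the question, and the glm warnings about non-integer Poisson responses are suppressed purely to keep the output readable.

```r
# One simulation run: regenerate the data, refit the model, return the biases.
# `simulate_bias` is a hypothetical helper name, not from the question.
simulate_bias <- function(n = 100) {
  x <- rnorm(n)
  z <- rbinom(n, 1, 0.60)
  y <- 1.4 + 0.7 * x - 0.5 * z
  don <- data.frame(y1 = abs(y), x1 = abs(x), z = z)
  # y1 is not an integer count, so the Poisson family warns about
  # non-integer values; the warnings are silenced here for readability only.
  reg <- suppressWarnings(
    glm(y1 ~ x1 + z, family = poisson(link = "log"), data = don)
  )
  # Estimated coefficients minus the values used to generate y
  unname(coef(reg)) - c(1.4, 0.7, -0.5)
}

set.seed(1)                                 # reproducible draws
biases <- replicate(100, simulate_bias())   # matrix: 3 rows, 100 columns
mean_bias <- rowMeans(biases)
names(mean_bias) <- c("biais0", "biais1", "biais2")

# Hypothetical output location; replace with the path you actually want.
out_file <- file.path(tempdir(), "mean_bias.txt")
write.table(t(mean_bias), file = out_file, row.names = FALSE, quote = FALSE)
```

Because each replicate returns a vector of length three, `replicate()` simplifies the results into a matrix, and `rowMeans()` then gives one mean bias per coefficient.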

Using geom_smooth for fitting a glm to fractions

Question: This post is somewhat related to this post. Here I have xy grouped data where the y values are fractions:

    library(dplyr)
    library(ggplot2)
    library(ggpmisc)
    set.seed(1)
    df1 <- data.frame(value = c(0.8,0.5,0.4,0.2,0.5,0.6,0.5,0.48,0.52),
                      age = rep(c("d2","d4","d45"),3),
                      group = c("A","A","A","B","B","B","C","C","C")) %>%
      dplyr::mutate(time = as.integer(age)) %>%
      dplyr::arrange(group,time) %>%
      dplyr::mutate(group_age=paste0(group,"_",age))
    df1$group_age <- factor(df1$group_age,levels=unique(df1$group_age))

Can multinomial models be estimated using Generalized Linear model?

Question: In the analysis of categorical data, we often use logistic regression to estimate relationships between binomial outcomes and one or more covariates. I understand this is a type of generalized linear model (GLM). In R, this is implemented with the glm function using the argument family=binomial. On the other hand, categorical data analysis also includes multinomial models. Are these not GLMs? And can't they be estimated in R using the glm function? (In this post for Multinomial Logistic Regression. The
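For context, a multinomial logit is commonly fitted in R with `multinom()` from the nnet package rather than with `glm()`. The sketch below is not an answer taken from the linked post; it assumes that nnet is installed, and it uses the built-in `iris` data purely as a stand-in for a three-category outcome.

```r
library(nnet)  # recommended package that provides multinom()

# Three-level outcome (Species) modelled on a single covariate.
# trace = FALSE silences the optimizer's iteration log.
fit <- multinom(Species ~ Sepal.Length, data = iris, trace = FALSE)

# One row of coefficients per non-reference level of the outcome.
coef(fit)
```

The first factor level acts as the reference category, so the coefficient matrix has one row fewer than the number of outcome levels.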

pass family= to step() via glm() programmatically

Question: I am trying to demonstrate via simulation the performance of different models and feature selection techniques, so I wish to pass various arguments to glm() programmatically. Under ?glm we read (italics mine): family: a description of the error distribution and link function to be used in the model. For glm this can be a character string naming a family function, a family function or the result of a call to a family function. For glm.fit only the third option is supported. (See family for
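As a rough illustration of the third option in the quoted help text (the result of a call to a family function), the family object can be built from strings and then handed to `glm()` and `step()`. This is a sketch only: the simulated data and the variable names `family_name`, `link_name`, and `dat` are assumptions made for the example, and everything runs in a single environment so that `step()` can re-evaluate the model call.

```r
set.seed(42)
n <- 200
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
dat$y <- rbinom(n, 1, plogis(0.5 + dat$x1))   # x2 is pure noise by design

# These strings could come from a loop over simulation settings.
family_name <- "binomial"
link_name   <- "logit"

# Build the family object by calling the family function by name.
fam <- do.call(family_name, list(link = link_name))

full    <- glm(y ~ x1 + x2, family = fam, data = dat)
reduced <- step(full, trace = 0)   # trace = 0 suppresses the step log

formula(reduced)
```

`step()` refits models by re-evaluating the stored call in the frame it was called from, so `fam` and `dat` need to be visible there; when the fit happens inside a helper function, that is the point where lookups tend to fail.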

Dummy Variables in Julia

Question: In R there is nice functionality for running a regression with dummy variables for each level of a categorical variable, e.g. Automatically expanding an R factor into a collection of 1/0 indicator variables for every factor level. Is there an equivalent way to do this in Julia?

    x = randn(1000)
    group = repmat(1:25 , 40)
    groupMeans = randn(25)
    y = 3*x + groupMeans[group]
    data = DataFrame(x=x, y=y, g=group)
    for i in levels(group)
        data[parse("I$i")] = data[:g] .== i
    end
    lm(y~x+I1+I2+I3+I4+I5+I6+I7

Cross validation for glm() models

Question: I'm trying to do a 10-fold cross validation for some glm models that I built earlier in R. I'm a little confused about the cv.glm() function in the boot package, although I've read a lot of help files. When I provide the following call:

    library(boot)
    cv.glm(data, glmfit, K=10)

Does the "data" argument here refer to the whole dataset or only to the test set? The examples I have seen so far provide the "data" argument as the test set but that did not really make sense, such as why do 10
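According to `?cv.glm`, `glmfit` is a glm object "fitted to data", so `data` is the data frame the model was fitted on, and the function does the splitting into K folds itself. A minimal sketch follows, assuming simulated data; the names `full_data` and `cost` are illustrative only.

```r
library(boot)

set.seed(123)
n <- 150
full_data <- data.frame(x = rnorm(n))
full_data$y <- rbinom(n, 1, plogis(-0.3 + 0.8 * full_data$x))

# Fit on the complete data set; cv.glm() refits on each training fold.
glmfit <- glm(y ~ x, family = binomial, data = full_data)

# Misclassification rate at a 0.5 threshold as the cost function.
cost <- function(obs, pred) mean(abs(obs - pred) > 0.5)

cv_result <- cv.glm(full_data, glmfit, cost = cost, K = 10)

# delta[1]: raw cross-validation estimate; delta[2]: adjusted estimate.
cv_result$delta
```

No separate test set is passed in: each of the 10 folds serves once as the held-out part while the model is refitted on the remaining rows.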

Can one extract model fit parameters after a ggplot stat_smooth call?

Question: Using stat_smooth, I can fit models to data. E.g.

    g=ggplot(tips,aes(x=tip,y=as.numeric(unclass(factor(tips$sex))-1))) +facet_grid(time~.)
    g=g+ stat_summary(fun.y=mean,geom="point")
    g=g+ stat_smooth(method="glm", family="binomial")

I would like to know the coefficients of the glm binomial fits. I could re-do the fit with dlply and get the coefficients with ldply, but I'd like to avoid such duplication. Calling str(g) reveals the hierarchy of objects that ggplot2 creates, perhaps there's some

How to debug “contrasts can be applied only to factors with 2 or more levels” error?

Question: Here are all the variables I'm working with:

    str(ad.train)
    $ Date   : Factor w/ 427 levels "2012-03-24","2012-03-29",..: 4 7 12 14 19 21 24 29 31 34 ...
    $ Team   : Factor w/ 18 levels "Adelaide","Brisbane Lions",..: 1 1 1 1 1 1 1 1 1 1 ...
    $ Season : int 2012 2012 2012 2012 2012 2012 2012 2012 2012 2012 ...
    $ Round  : Factor w/ 28 levels "EF","GF","PF",..: 5 16 21 22 23 24 25 26 27 6 ...
    $ Score  : int 137 82 84 96 110 99 122 124 49 111 ...
    $ Margin : int 69 18 -56 46 19 5 50 69 -26 29 ...
    $
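A common first check for this error is to count the factor levels that remain once rows with missing values are dropped, since the model only sees complete cases. The sketch below uses a small made-up data frame (`toy`) rather than `ad.train`, whose full structure is not shown in the excerpt.

```r
# Hypothetical toy data: `team` has two levels overall, but only one
# survives after the rows with a missing `margin` are removed.
toy <- data.frame(
  score  = c(10, 12, 9, 14, 11, 13),
  team   = factor(c("A", "A", "A", "A", "B", "B")),
  margin = c(1, 2, 3, 4, NA, NA)
)

# Keep complete cases only, then drop factor levels that no longer occur.
used <- droplevels(na.omit(toy))

# Number of remaining levels for every factor column.
n_levels <- sapply(Filter(is.factor, used), nlevels)

# Factors with fewer than two levels are the ones that trigger the error.
n_levels[n_levels < 2]

# Fitting on the original data reproduces the failure; capture the message.
msg <- tryCatch(
  { lm(score ~ team + margin, data = toy); "no error" },
  error = function(e) conditionMessage(e)
)
msg
```

Any column reported by the level count either needs to be removed from the formula or the missing values elsewhere need handling so that at least two levels remain.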

Using stargazer with memory greedy glm objects

Question: I'm trying to run the following regressions:

    m1=glm(y~x1+x2+x3+x4,data=df,family=binomial())
    m2=glm(y~x1+x2+x3+x4+x5,data=df,family=binomial())
    m3=glm(y~x1+x2+x3+x4+x5+x6,data=df,family=binomial())
    m4=glm(y~x1+x2+x3+x4+x5+x6+x7,data=df,family=binomial())

and then to print them using the stargazer package:

    stargazer(m1,m2,m3,m4, type="html", out="models.html")

The thing is, the data frame df is rather big (~600MB) and thus each glm object I create is at least ~1.5GB. This creates a memory issue
