glm

glm starting values not accepted log-link

对着背影说爱祢 posted on 2019-12-06 03:34:24
Question: I want to run a Gaussian GLM with a log link and an offset. The following problems arise:

    y <- c(1,1,0,0)
    t <- c(5,3,2,4)

No problem:

    exp(coef(glm(y~1 + offset(log(t)), family=poisson)))

With family=gaussian, starting values need to be specified. It works here:

    exp(coef(glm(y~1, family=gaussian(link=log), start=0)))

but does not work here:

    exp(coef(glm(y~1 + offset(log(t)), family=gaussian(link=log), start=0)))
    Error in eval(expr, envir, enclos) : cannot find valid starting values: please
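One way to probe the failing case is to call glm.fit() directly with the same offset and start = 0. This is only a sketch under assumptions: the intercept-only design matrix X is constructed here for illustration and does not appear in the question, and whether glm() itself raises the error may depend on the R version.

```r
# Data from the question
y <- c(1, 1, 0, 0)
t <- c(5, 3, 2, 4)

# Hypothetical intercept-only design matrix (not part of the original post)
X <- matrix(1, nrow = length(y), ncol = 1,
            dimnames = list(NULL, "(Intercept)"))

# Gaussian log-link fit with the offset, starting from a coefficient of 0
fit <- glm.fit(x = X, y = y, offset = log(t),
               family = gaussian(link = "log"), start = 0)

# Back-transform the intercept to the rate scale
rate <- exp(fit$coefficients)
print(rate)
```

For an intercept-only model of this form, the least-squares rate has the closed form sum(y * t) / sum(t^2), which gives a quick sanity check on the fitted value.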

Why the auc is so different from logistic regression of sklearn and R

这一生的挚爱 posted on 2019-12-06 02:58:37
I use the same dataset to train a logistic regression model in both R and Python's sklearn. The dataset is unbalanced, and I find that the AUC is quite different. This is the Python code:

    model_logistic = linear_model.LogisticRegression() #auc 0.623
    model_logistic.fit(train_x, train_y)
    pred_logistic = model_logistic.predict(test_x) #mean:0.0235 var:0.023
    print "logistic auc: ", sklearn.metrics.roc_auc_score(test_y,pred_logistic)

This is the R code:

    glm_fit <- glm(label ~ watch_cnt_7 + bid_cnt_7 + vi_cnt_itm_1 + ITEM_PRICE + add_to_cart_cnt_7 + offer_cnt_7 + dwell_dlta_4to2 + vi_cnt_itm_2 + asq
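One plausible contributor is that the Python code passes the output of predict(), which in sklearn returns class labels, to roc_auc_score, whereas an AUC is normally computed from continuous scores such as probabilities. The sketch below illustrates the effect in R on simulated data. The helper auc_rank and the simulated data are assumptions made for illustration and are not from the question.

```r
# Rank-based (Mann-Whitney) AUC; auc_rank is a hypothetical helper
auc_rank <- function(labels, scores) {
  r <- rank(scores)                 # average ranks, so ties are handled
  n_pos <- sum(labels == 1)
  n_neg <- sum(labels == 0)
  (sum(r[labels == 1]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

# Simulated unbalanced data (illustrative only)
set.seed(1)
x <- rnorm(2000)
y <- rbinom(2000, 1, plogis(-3 + x))

fit <- glm(y ~ x, family = binomial)
prob <- predict(fit, type = "response")  # predicted probabilities
hard <- as.numeric(prob > 0.5)           # thresholded 0/1 labels

auc_prob <- auc_rank(y, prob)
auc_hard <- auc_rank(y, hard)
print(c(probabilities = auc_prob, labels = auc_hard))
```

Thresholding collapses the ranking to two levels, so the two numbers generally differ, especially on unbalanced data where few cases cross the 0.5 cutoff.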

GLM fit (logistic regression) to SQL

♀尐吖头ヾ posted on 2019-12-06 02:16:42
We frequently score data directly in the database for simple models like linear or logistic regression. It is always a little tricky to transfer all coefficients from R to SQL correctly. I thought I could write an R-to-SQL translation for glm results. For numeric variables this is pretty straightforward:

    library(rpart)
    fit <- glm(Kyphosis ~ ., data = kyphosis, family = binomial())
    coefs <- fit$coef[2:length(fit$coef)]
    expr <- paste0('1/(1 + exp(-(',fit$coef[1], '+', paste0('(', coefs, '*', names(coefs), ')', collapse = '+'),')))')
    print(expr)
    a <- with(kyphosis, eval(parse(text = expr)))
    b <-
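The same idea can be sketched end to end for numeric predictors, including a check of the generated expression against predict(). The mtcars model, the table name my_table and the column alias score are assumptions chosen for illustration. They are not from the question.

```r
# Illustrative model on a built-in dataset (not the question's kyphosis data)
fit <- glm(am ~ wt + hp, data = mtcars, family = binomial())

b <- coef(fit)
# One "(coefficient * column)" term per numeric predictor
terms <- paste0("(", b[-1], " * ", names(b)[-1], ")", collapse = " + ")
linear <- paste0(b[1], " + ", terms)

# Logistic score written with operators that R and most SQL dialects share
expr <- paste0("1 / (1 + exp(-(", linear, ")))")
sql <- paste0("SELECT ", expr, " AS score FROM my_table")  # my_table is hypothetical
print(sql)

# Evaluate the generated expression in R and compare with predict()
manual <- with(mtcars, eval(parse(text = expr)))
reference <- unname(predict(fit, type = "response"))
print(all.equal(manual, reference))
```

Factor predictors would need one CASE expression per level, which this sketch does not attempt.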

H2o GLM interact only certain predictors

大城市里の小女人 posted on 2019-12-05 20:06:22
I'm interested in creating interaction terms in h2o.glm(), but I do not want to generate all pairwise interactions. For example, in the mtcars dataset, I want to interact 'mpg' with all the other factors such as 'cyl', 'hp', and 'disp', but I don't want the other factors to interact with each other (so I don't want disp_hp or disp_cyl). How should I best approach this problem using the (interactions = interactions_list) parameter in h2o.glm()? Thank you

According to ?h2o.glm the interactions= parameter takes: A list of predictor column indices to interact. All pairwise combinations will be
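One workaround that does not depend on the interactions= parameter at all is to build the wanted product columns beforehand and pass them as ordinary predictors. The sketch below uses base R's model.matrix rather than any h2o function, and the names others, inter and augmented are chosen here for illustration.

```r
# Predictors that should each be interacted with mpg, and with nothing else
others <- c("cyl", "hp", "disp")

# Formula of the form ~ mpg:(cyl + hp + disp) - 1, so only mpg:* terms appear
f <- as.formula(paste("~ mpg:(", paste(others, collapse = " + "), ") - 1"))
inter <- model.matrix(f, data = mtcars)
print(colnames(inter))

# Append the product columns to the original data
augmented <- cbind(mtcars, inter)
```

The augmented data frame contains no disp:hp or disp:cyl columns, so a model fitted on it only sees the mpg interactions.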

R - using glm inside a data.table

♀尐吖头ヾ posted on 2019-12-05 08:01:23
I'm trying to run some glms inside a data.table to produce modelled results split by key factors. I've been doing this successfully for:

High-level glm:

    glm(modellingDF,formula=Outcome~IntCol + DecCol,family=binomial(link=logit))

Scoped glm with single columns:

    modellingDF[,list(Outcome, fitted=glm(x,formula=Outcome~IntCol ,family=binomial(link=logit))$fitted ), by=variable]

Scoped glm with two integer columns:

    modellingDF[,list(Outcome, fitted=glm(x,formula=Outcome~IntCol + IntCol2 ,family=binomial(link=logit))$fitted ), by=variable]

But, when I try and do the high level glm inside the scope with

Missing object error when using step() within a user-defined function

送分小仙女□ posted on 2019-12-04 23:47:10
Question: 5 days and still no answer. As can be seen from Simon's comment, this is a reproducible and very strange issue. It seems that the issue only arises when a stepwise regression with very high predictive power is wrapped in a function. I have been struggling with this for a while and any help would be much appreciated. I am trying to write a function that runs several stepwise regressions and outputs all of them to a list. However, R is having trouble reading the dataset that I specify in my

Fractional Response Regression in R

巧了我就是萌 posted on 2019-12-04 12:52:45
I am trying to model data in which the response variable lies between 0 and 1, so I have decided to use a fractional response model in R. From my current understanding, the fractional response model is similar to logistic regression, but it uses a quasi-likelihood method to estimate the parameters. I am not sure I understand it correctly. So far I have tried frm from the package frm and glm on the following data, which is the same as in this OP:

    library(foreign)
    mydata <- read.dta("k401.dta")

Further, I followed the procedure in this OP, in which glm is used. However, with the same dataset with
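A common way to fit such a model with base R is glm with a quasibinomial family and logit link, which accepts a response in (0, 1) and estimates the coefficients by quasi-likelihood. The sketch below uses simulated data, not the k401.dta file from the question, and the names frac_fit and fitted_vals are chosen here for illustration.

```r
set.seed(42)
n <- 200
x <- rnorm(n)

# Simulated fractional response strictly inside (0, 1); illustrative only
y <- plogis(0.5 + 0.8 * x + rnorm(n, sd = 0.5))

# Quasi-likelihood logit fit for a fractional response
frac_fit <- glm(y ~ x, family = quasibinomial(link = "logit"))
print(coef(frac_fit))

# Fitted means stay inside the unit interval
fitted_vals <- fitted(frac_fit)
print(range(fitted_vals))
```

The standard errors from summary() here rely on the estimated dispersion. Robust (sandwich) standard errors, which are often reported for fractional response models, would need an additional package.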

MCMCglmm multinomial model in R

℡╲_俬逩灬. posted on 2019-12-04 11:49:04
Question: I'm trying to create a model using the MCMCglmm package in R. The data are structured as follows, where dyad, focal, other are all random effects, predict1-2 are predictor variables, and response 1-5 are outcome variables that capture # of observed behaviors of different subtypes:

    dyad focal other r   present village resp1 resp2 resp3 resp4 resp5
    1    10101 14302 0.5 3       1       0     0     4     0     5
    2    10405 11301 0.0 5       0       0     0     1     0     1
    …

So a model with only one outcome (teaching) is as follows:

    prior_overdisp_i <- list

Predict.glm not predicting missing values in response

核能气质少年 posted on 2019-12-04 11:23:03
Question: For some reason, when I specify glms (and lms too, it turns out), R is not predicting missing values of the data. Here is an example:

    y = round(runif(50))
    y = c(y,rep(NA,50))
    x = rnorm(100)
    m = glm(y~x, family=binomial(link="logit"))
    p = predict(m,na.action=na.pass)
    length(p)

    y = round(runif(50))
    y = c(y,rep(NA,50))
    x = rnorm(100)
    m = lm(y~x)
    p = predict(m)
    length(p)

The length of p should be 100, but it's 50. The weird thing is that I have other predicts in the same script that do predict
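The behaviour can be reproduced and worked around in two ways, sketched below: passing the predictors explicitly as newdata, or fitting with na.action = na.exclude so that dropped rows are padded with NA. The object names p_fit, p_all, m_ex and p_pad are chosen here for illustration.

```r
set.seed(1)
y <- c(round(runif(50)), rep(NA, 50))
x <- rnorm(100)
m <- glm(y ~ x, family = binomial(link = "logit"))

# Without newdata, predict() returns values only for the rows used in the fit
p_fit <- predict(m)
print(length(p_fit))

# With newdata, every row of x gets a prediction, including rows where y is NA
p_all <- predict(m, newdata = data.frame(x = x))
print(length(p_all))

# Alternative: na.exclude keeps the original length and pads dropped rows with NA
m_ex <- glm(y ~ x, family = binomial(link = "logit"), na.action = na.exclude)
p_pad <- predict(m_ex)
print(length(p_pad))
```

The newdata route gives actual predictions for the rows with a missing response, while na.exclude only preserves alignment with the original data.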

How to set the Coefficient Value in Regression; R

↘锁芯ラ posted on 2019-12-04 09:58:25
I'm looking for a way to specify the value of a predictor variable. When I run a glm with my current data, the coefficient for one of my variables is close to one. I'd like to set it at .8. I know this will give me a lower R^2 value, but I know a priori that the predictive power of the model will be greater. The weights component of glm looks promising, but I haven't figured it out yet. Any help would be greatly appreciated.

I believe you are looking for the offset argument in glm. So for example, you might do something like this:

    glm(y ~ x1, offset = x2,...)

where in this case the
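An offset enters the linear predictor with a coefficient fixed at 1, so holding a coefficient at 0.8 amounts to using 0.8 times the variable as the offset. The sketch below uses simulated data, and the names x1, x2 and fit_fixed are assumptions for illustration.

```r
set.seed(7)
n <- 100
x1 <- rnorm(n)
x2 <- rnorm(n)
y <- 1 + 0.5 * x1 + 1.0 * x2 + rnorm(n)   # simulated data, illustrative only

# The x2 coefficient is held at 0.8 by moving 0.8 * x2 into the offset
fit_fixed <- glm(y ~ x1 + offset(0.8 * x2), family = gaussian)
print(coef(fit_fixed))

# Equivalent for the Gaussian identity case: regress y - 0.8 * x2 on x1
fit_check <- lm(I(y - 0.8 * x2) ~ x1)
print(coef(fit_check))
```

Only the intercept and the x1 coefficient are estimated. x2 does not appear in the coefficient table because its effect is fixed in advance.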