regression

R regression analysis: analyzing data for a certain ethnicity

亡梦爱人 submitted on 2019-12-02 01:25:58
Question: I have a data set that investigates depression among individuals of different ethnicities (Black, White, and Latina). To see how depression at baseline relates to depression at post across all ethnic groups, I ran lm(depression_base ~ depression_post, data = Data). Now I want to look at the relationship by ethnicity. Ethnicity in my dataset is coded as 0 = White, 1 = Black, and 2 = Latina. I am thinking that I need to use the ifelse function, but I cannot seem to get it to work. Here …
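A hedged sketch of one common approach: rather than ifelse, recode ethnicity as a factor and either use an interaction (one model, a separate slope per group) or subset the data per group. Column names (depression_base, depression_post, ethnicity) and the data frame name Data are taken from the question; the labels are assumptions based on the stated coding.

    # Assumes Data holds depression_base, depression_post and a 0/1/2 ethnicity column.
    Data$ethnicity <- factor(Data$ethnicity, levels = c(0, 1, 2),
                             labels = c("White", "Black", "Latina"))

    # One model with an interaction: a separate slope for each ethnic group
    # (mirroring the formula direction used in the question).
    fit_all <- lm(depression_base ~ depression_post * ethnicity, data = Data)
    summary(fit_all)

    # Or fit each group on its own by subsetting.
    fit_white <- lm(depression_base ~ depression_post,
                    data = subset(Data, ethnicity == "White"))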

R: build separate models for each category

北战南征 submitted on 2019-12-02 00:53:22
Short version: how to build separate models for each category (without splitting the data)? (I am new to R.) Long version: consider the following synthetic data: housetype,ht1,ht2,age,price O,0,1,1,1000 O,0,1,2,2000 O,0,1,3,3000 N,1,0,1,10000 N,1,0,2,20000 N,1,0,3,30000 We could model the above with two separate models: if (housetype == 'O') price = 1000 * age else price = 10000 * age, i.e. a separate model per category type. This is what I have tried: model = lm(price ~ housetype + age, data = datavar) and model = lm(price ~ ht1 + ht2 + age, data = datavar). Both of the above models (which are essentially the same) …
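A minimal sketch of two ways to get category-specific fits without splitting the data frame by hand; datavar and its columns follow the question, and everything else is illustrative.

    # Interaction model: housetype-specific intercepts and age slopes in one lm() call.
    model_int <- lm(price ~ housetype * age, data = datavar)

    # If only the slope should differ (price = slope * age), drop the intercept.
    model_slopes <- lm(price ~ housetype:age + 0, data = datavar)

    # Or fit one lm() per housetype; by() returns a list-like object of models.
    models_by_type <- by(datavar, datavar$housetype,
                         function(d) lm(price ~ age, data = d))
    lapply(models_by_type, coef)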

Why predicted polynomial changes drastically when only the resolution of prediction grid changes?

ε祈祈猫儿з submitted on 2019-12-02 00:28:06
Question: Why do I get different predictions from the exact same model when I only change the resolution of the prediction grid (by 0.001 vs. by 0.01)? set.seed(0) n_data=2000 x=runif(n_data)-0.5 y=0.1*sin(x*30)/x+runif(n_data) plot(x,y) poly_df=5 x_exp=as.data.frame(cbind(y,poly(x, poly_df))) fit=lm(y~.,data=x_exp) x_plt1=seq(-1,1,0.001) x_plt_exp1=as.data.frame(poly(x_plt1,poly_df)) lines(x_plt1 …
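The usual explanation (hedged, since the excerpt is truncated): poly() builds an orthogonal basis from whatever x it is given, so calling poly() again on a plotting grid produces a different basis than the one the model was fitted on. A minimal sketch of the fix, reusing the question's variable names, keeps the original poly object and transforms new points with predict() on it.

    # Keep the basis that was used for fitting.
    pb  <- poly(x, poly_df)
    fit <- lm(y ~ ., data = as.data.frame(cbind(y, pb)))

    # Transform any new grid with the *same* basis via predict() on the poly object,
    # so both coarse and fine grids give the same curve.
    x_new   <- seq(-0.5, 0.5, 0.01)
    new_exp <- as.data.frame(predict(pb, x_new))
    lines(x_new, predict(fit, new_exp), lwd = 3, col = 4)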

Change reference group using glm with binomial family

做~自己de王妃 submitted on 2019-12-02 00:11:30
When I run a binomial regression in R with an independent factor variable consisting of the three levels "Higher", "Middle" and "Lower", and try to change the reference category using relevel, I get this error: "Error in relevel.ordered(cbsnivcat3, "Lower") : 'relevel' only for factors". I have checked whether cbsnivcat3 is a factor: > is.factor(data$cbsnivcat3) [1] TRUE > levels(data$cbsnivcat3) [1] "Higher" "Middle" "Lower" > t1m4=glm(tertiary ~ relevel(cbsnivcat3, "Lower"), family = binomial, data = data) Error in relevel.ordered(cbsnivcat3, "Lower") : 'relevel' only for factors but the …
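The error message hints that cbsnivcat3 is an ordered factor (is.factor() is TRUE for ordered factors too), and relevel() only works on unordered ones. A hedged sketch of the usual workaround, using the variable names from the question:

    # relevel() refuses ordered factors; check the ordering first.
    is.ordered(data$cbsnivcat3)

    # Drop the ordering, then set "Lower" as the reference level.
    data$cbsnivcat3 <- factor(data$cbsnivcat3, ordered = FALSE)
    data$cbsnivcat3 <- relevel(data$cbsnivcat3, ref = "Lower")

    t1m4 <- glm(tertiary ~ cbsnivcat3, family = binomial, data = data)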

Missing values in MS Excel LINEST, TREND, LOGEST and GROWTH functions

我只是一个虾纸丫 submitted on 2019-12-02 00:06:31
I'm using the GROWTH function (or LINEST or TREND or LOGEST; they all cause the same trouble) in Excel 2003. The problem is that if some data is missing, the function refuses to give a result. You can download the file here. Is there any workaround? I'm looking for an easy and elegant solution. I don't want the obvious workaround of getting rid of the missing value: that would mean deleting the column, which would also damage the graph and cause problems in my other tables, where I have more rows and missing data in different columns. Another obvious workaround is to use one data for …

How to interpret MSE in Keras Regressor

狂风中的少年 submitted on 2019-12-01 23:27:53
I am new to Keras/TF/deep learning and I am trying to build a model to predict house prices. I have some features X (no. of bathrooms, etc.) and a target Y (ranging from about $300,000 to $800,000). I have used sklearn's StandardScaler to standardize Y before fitting the model. Here is my Keras model: def build_model(): model = Sequential() model.add(Dense(36, input_dim=36, activation='relu')) model.add(Dense(18, input_dim=36, activation='relu')) model.add(Dense(1, activation='sigmoid')) model.compile(loss='mse', optimizer='sgd', metrics=['mae','mse']) return model I am having trouble trying …
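One way to make the reported MSE interpretable, sketched here in R with made-up numbers and under the assumption that Y was standardized by StandardScaler (centred and divided by its standard deviation): take the square root and multiply by sd(Y) to get an error in dollars.

    # Hypothetical values: MSE reported by Keras on the standardized target,
    # and the standard deviation of the original house prices (assumed).
    mse_scaled <- 0.04
    sd_price   <- 150000

    # RMSE back on the original scale: undo the scaling applied by StandardScaler.
    rmse_dollars <- sqrt(mse_scaled) * sd_price
    rmse_dollars   # ~30000, i.e. a typical error of about $30k

Separately, note that a sigmoid output layer bounds predictions to (0, 1), which cannot represent standardized targets outside that range; a linear output activation is the usual choice for regression.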

Coefficient table does not have NA rows in rank-deficient fit; how to insert them?

天大地大妈咪最大 submitted on 2019-12-01 23:23:53
library(lmPerm) x <- lmp(formula = a ~ b * c + d + e, data = df, perm = "Prob") summary(x) # truncated output, I can see `NA` rows here! #Coefficients: (1 not defined because of singularities) # Estimate Iter Pr(Prob) #b 5.874 51 1.000 #c -30.060 281 0.263 #b:c NA NA NA #d1 -31.333 60 0.633 #d2 33.297 165 0.382 #d3 -19.096 51 1.000 #e 1.976 NA NA I want to pull out the Pr(Prob) results for everything, but y <- summary(x)$coef[, "Pr(Prob)"] #(Intercept) b c d1 d2 # 0.09459459 1.00000000 0.26334520 0.63333333 0.38181818 # d3 e # 1.00000000 NA This is not what I want. I need the b:c row, too, in the …
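A minimal sketch of one way to get a full-length result: coef(x) keeps the aliased (NA) coefficients and their names, so it can serve as a template into which the rows that summary() does report are matched. Object names follow the question; the exact behaviour of lmPerm's summary method is assumed from the excerpt.

    # coef(x) has one entry per term, including the NA for the aliased b:c term.
    all_terms <- names(coef(x))

    # Start with an all-NA vector, then fill in the rows summary() actually reports.
    p_full <- setNames(rep(NA_real_, length(all_terms)), all_terms)
    tab    <- summary(x)$coef
    p_full[rownames(tab)] <- tab[, "Pr(Prob)"]
    p_full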

Why predicted polynomial changes drastically when only the resolution of prediction grid changes?

 ̄綄美尐妖づ submitted on 2019-12-01 22:24:45
Why do I get different predictions from the exact same model when I only change the resolution of the prediction grid (by 0.001 vs. by 0.01)? set.seed(0) n_data=2000 x=runif(n_data)-0.5 y=0.1*sin(x*30)/x+runif(n_data) plot(x,y) poly_df=5 x_exp=as.data.frame(cbind(y,poly(x, poly_df))) fit=lm(y~.,data=x_exp) x_plt1=seq(-1,1,0.001) x_plt_exp1=as.data.frame(poly(x_plt1,poly_df)) lines(x_plt1,predict(fit,x_plt_exp1),lwd=3,col=2) x_plt2=seq(-1,1,0.01) x_plt_exp2=as.data.frame(poly(x_plt2,poly_df)) lines(x_plt2,predict(fit,x_plt_exp2),lwd=3,col=3) 李哲源: This is a coding / programming problem, as on my quick run I …
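A complementary sketch (assuming the same objects as in the earlier excerpt): with raw = TRUE the polynomial columns are plain powers of x, so they do not depend on which grid they are evaluated on, and coarse and fine grids give identical curves.

    # Raw (non-orthogonal) polynomial: columns are x, x^2, ..., x^5 and are
    # grid-independent, at the cost of a less well-conditioned design matrix.
    x_exp_raw <- as.data.frame(cbind(y, poly(x, poly_df, raw = TRUE)))
    fit_raw   <- lm(y ~ ., data = x_exp_raw)

    grid     <- seq(-0.5, 0.5, 0.01)
    grid_exp <- as.data.frame(poly(grid, poly_df, raw = TRUE))
    lines(grid, predict(fit_raw, grid_exp), lwd = 3, col = 4)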

Robust SEs (vcovHC) to be shown with texreg in R

主宰稳场 submitted on 2019-12-01 22:15:23
I am doing some regressions with the plm package and, when needed, I also obtain heteroskedasticity-consistent coefficients. Below are the commands that I run: library(plm) data("Produc", package = "plm") zz <- plm(log(gsp) ~ log(pcap) + log(pc) + log(emp) + unemp, data = Produc, index = c("state","year")) summary(zz) coeftest(zz, vcovHC) My problem starts here. Below is the command for obtaining LaTeX output with the help of texreg. How can I integrate the result obtained with the coeftest command into the LaTeX output? latex_reg <- texreg(list(coeftest_result), scriptsize=TRUE)
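A hedged sketch of one common route: texreg() accepts override.se and override.pvalues arguments, so the robust standard errors and p-values from coeftest() can be attached to the original plm model rather than passing the coeftest result itself. coeftest() lives in the lmtest package; column positions below assume its usual estimate / SE / statistic / p-value layout.

    library(lmtest)
    library(texreg)

    rob <- coeftest(zz, vcovHC)   # matrix: estimate, robust SE, t value, p value

    latex_reg <- texreg(list(zz),
                        override.se      = list(rob[, 2]),
                        override.pvalues = list(rob[, 4]),
                        scriptsize = TRUE)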

How to add a linear regression line to a double-logarithmic R plot?

我只是一个虾纸丫 submitted on 2019-12-01 21:55:13
I have the following data: someFactor = 500 x = c(1:250) y = x^-.25 * someFactor which I show in a double logarithmic plot: plot(x, y, log="xy") Now I "find out" the slope of the data using a linear model: model = lm(log(y) ~ log(x)) model which gives: Call: lm(formula = log(y) ~ log(x)) Coefficients: (Intercept) log(x) 6.215 -0.250 Now I'd like to plot the linear regression as a red line, but abline does not work: abline(model, col="red") What is the easiest way to add a regression line to my plot? lines(log(x), exp(predict(model, newdata=list(x=log(x)))), col="red") The range of values for x …
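A minimal sketch of an alternative, using only the objects defined in the question: because plot(x, y, log="xy") still expects coordinates on the original scale, the fitted values of the log-log model can be exponentiated and drawn with lines().

    plot(x, y, log = "xy")
    model <- lm(log(y) ~ log(x))

    # fitted(model) are predictions of log(y) at the observed x,
    # so exp() puts them back on the original scale for the log-log axes.
    lines(x, exp(fitted(model)), col = "red")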