regression

R regression analysis: analyzing data for a certain ethnicity

亡梦爱人 submitted on 2019-12-02 01:25:58
Question: I have a data set that investigates depression among individuals of different ethnicities (Black, White, and Latina). To see how depression at baseline relates to depression at post across all ethnic groups, I ran lm(depression_base ~ depression_post, data = Data). Now I want to look at the relationship by ethnicity. Ethnicity in my dataset is coded as 0 = White, 1 = Black, and 2 = Latina. I am thinking that I need to use the ifelse function, but I cannot seem to get it to work. Here …
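A hedged sketch of one common approach: rather than ifelse, recode ethnicity as a factor and either use an interaction (one model, a separate slope per group) or subset the data per group. Column names (depression_base, depression_post, ethnicity) and the data frame name Data are taken from the question; the labels are assumptions based on the stated coding.

    # Assumes Data holds depression_base, depression_post and a 0/1/2 ethnicity column.
    Data$ethnicity <- factor(Data$ethnicity, levels = c(0, 1, 2),
                             labels = c("White", "Black", "Latina"))

    # One model with an interaction: a separate slope for each ethnic group
    # (mirroring the formula direction used in the question).
    fit_all <- lm(depression_base ~ depression_post * ethnicity, data = Data)
    summary(fit_all)

    # Or fit each group on its own by subsetting.
    fit_white <- lm(depression_base ~ depression_post,
                    data = subset(Data, ethnicity == "White"))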

R: build separate models for each category

北战南征 submitted on 2019-12-02 00:53:22
Short version: how to build separate models for each category (without splitting the data)? (I am new to R.) Long version: consider the following synthetic data: housetype,ht1,ht2,age,price O,0,1,1,1000 O,0,1,2,2000 O,0,1,3,3000 N,1,0,1,10000 N,1,0,2,20000 N,1,0,3,30000 We could model the above with two separate models: if (housetype == 'O') price = 1000 * age else price = 10000 * age, i.e. a separate model per category type. This is what I have tried: model = lm(price ~ housetype + age, data = datavar) and model = lm(price ~ ht1 + ht2 + age, data = datavar). Both of the above models (which are essentially the same) …
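A minimal sketch of two ways to get category-specific fits without splitting the data frame by hand; datavar and its columns follow the question, and everything else is illustrative.

    # Interaction model: housetype-specific intercepts and age slopes in one lm() call.
    model_int <- lm(price ~ housetype * age, data = datavar)

    # If only the slope should differ (price = slope * age), drop the intercept.
    model_slopes <- lm(price ~ housetype:age + 0, data = datavar)

    # Or fit one lm() per housetype; by() returns a list-like object of models.
    models_by_type <- by(datavar, datavar$housetype,
                         function(d) lm(price ~ age, data = d))
    lapply(models_by_type, coef)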

Why predicted polynomial changes drastically when only the resolution of prediction grid changes?

ε祈祈猫儿з submitted on 2019-12-02 00:28:06
Question: Why do I get different predictions from the exact same model when I only change the resolution of the prediction grid (by 0.001 vs. by 0.01)? set.seed(0) n_data=2000 x=runif(n_data)-0.5 y=0.1*sin(x*30)/x+runif(n_data) plot(x,y) poly_df=5 x_exp=as.data.frame(cbind(y,poly(x, poly_df))) fit=lm(y~.,data=x_exp) x_plt1=seq(-1,1,0.001) x_plt_exp1=as.data.frame(poly(x_plt1,poly_df)) lines(x_plt1 …
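The usual explanation (hedged, since the excerpt is truncated): poly() builds an orthogonal basis from whatever x it is given, so calling poly() again on a plotting grid produces a different basis than the one the model was fitted on. A minimal sketch of the fix, reusing the question's variable names, keeps the original poly object and transforms new points with predict() on it.

    # Keep the basis that was used for fitting.
    pb  <- poly(x, poly_df)
    fit <- lm(y ~ ., data = as.data.frame(cbind(y, pb)))

    # Transform any new grid with the *same* basis via predict() on the poly object,
    # so both coarse and fine grids give the same curve.
    x_new   <- seq(-0.5, 0.5, 0.01)
    new_exp <- as.data.frame(predict(pb, x_new))
    lines(x_new, predict(fit, new_exp), lwd = 3, col = 4)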

Change reference group using glm with binomial family

做~自己de王妃 submitted on 2019-12-02 00:11:30
When I run a binomial regression in R with an independent factor variable consisting of the three levels "Higher", "Middle" and "Lower", and try to change the reference category using relevel, I get this error: "Error in relevel.ordered(cbsnivcat3, "Lower") : 'relevel' only for factors". I have checked whether cbsnivcat3 is a factor: > is.factor(data$cbsnivcat3) [1] TRUE > levels(data$cbsnivcat3) [1] "Higher" "Middle" "Lower" > t1m4=glm(tertiary ~ relevel(cbsnivcat3, "Lower"), family = binomial, data = data) Error in relevel.ordered(cbsnivcat3, "Lower") : 'relevel' only for factors but the …
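The error message hints that cbsnivcat3 is an ordered factor (is.factor() is TRUE for ordered factors too), and relevel() only works on unordered ones. A hedged sketch of the usual workaround, using the variable names from the question:

    # relevel() refuses ordered factors; check the ordering first.
    is.ordered(data$cbsnivcat3)

    # Drop the ordering, then set "Lower" as the reference level.
    data$cbsnivcat3 <- factor(data$cbsnivcat3, ordered = FALSE)
    data$cbsnivcat3 <- relevel(data$cbsnivcat3, ref = "Lower")

    t1m4 <- glm(tertiary ~ cbsnivcat3, family = binomial, data = data)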

Missing values in MS Excel LINEST, TREND, LOGEST and GROWTH functions

我只是一个虾纸丫 submitted on 2019-12-02 00:06:31
I'm using the GROWTH function (or LINEST or TREND or LOGEST; they all cause the same trouble) in Excel 2003. The problem is that if some data is missing, the function refuses to give a result. You can download the file here. Is there any workaround? I'm looking for an easy and elegant solution. I don't want the obvious workaround of getting rid of the missing value: that would mean deleting the column, which would also damage the graph and cause problems in my other tables, where I have more rows and missing data in different columns. Another obvious workaround is to use one data for …

How to interpret MSE in Keras Regressor

狂风中的少年 submitted on 2019-12-01 23:27:53
I am new to Keras/TF/deep learning and I am trying to build a model to predict house prices. I have some features X (no. of bathrooms, etc.) and a target Y (ranging from about $300,000 to $800,000). I have used sklearn's StandardScaler to standardize Y before fitting the model. Here is my Keras model: def build_model(): model = Sequential() model.add(Dense(36, input_dim=36, activation='relu')) model.add(Dense(18, input_dim=36, activation='relu')) model.add(Dense(1, activation='sigmoid')) model.compile(loss='mse', optimizer='sgd', metrics=['mae','mse']) return model I am having trouble trying …
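One way to make the reported MSE interpretable, sketched here in R with made-up numbers and under the assumption that Y was standardized by StandardScaler (centred and divided by its standard deviation): take the square root and multiply by sd(Y) to get an error in dollars.

    # Hypothetical values: MSE reported by Keras on the standardized target,
    # and the standard deviation of the original house prices (assumed).
    mse_scaled <- 0.04
    sd_price   <- 150000

    # RMSE back on the original scale: undo the scaling applied by StandardScaler.
    rmse_dollars <- sqrt(mse_scaled) * sd_price
    rmse_dollars   # ~30000, i.e. a typical error of about $30k

Separately, note that a sigmoid output layer bounds predictions to (0, 1), which cannot represent standardized targets outside that range; a linear output activation is the usual choice for regression.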

Coefficient table does not have NA rows in rank-deficient fit; how to insert them?

天大地大妈咪最大 submitted on 2019-12-01 23:23:53
library(lmPerm) x <- lmp(formula = a ~ b * c + d + e, data = df, perm = "Prob") summary(x) # truncated output, I can see `NA` rows here! #Coefficients: (1 not defined because of singularities) # Estimate Iter Pr(Prob) #b 5.874 51 1.000 #c -30.060 281 0.263 #b:c NA NA NA #d1 -31.333 60 0.633 #d2 33.297 165 0.382 #d3 -19.096 51 1.000 #e 1.976 NA NA I want to pull out the Pr(Prob) results for everything, but y <- summary(x)$coef[, "Pr(Prob)"] #(Intercept) b c d1 d2 # 0.09459459 1.00000000 0.26334520 0.63333333 0.38181818 # d3 e # 1.00000000 NA This is not what I want. I need the b:c row, too, in the …
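A minimal sketch of one way to get a full-length result: coef(x) keeps the aliased (NA) coefficients and their names, so it can serve as a template into which the rows that summary() does report are matched. Object names follow the question; the exact behaviour of lmPerm's summary method is assumed from the excerpt.

    # coef(x) has one entry per term, including the NA for the aliased b:c term.
    all_terms <- names(coef(x))

    # Start with an all-NA vector, then fill in the rows summary() actually reports.
    p_full <- setNames(rep(NA_real_, length(all_terms)), all_terms)
    tab    <- summary(x)$coef
    p_full[rownames(tab)] <- tab[, "Pr(Prob)"]
    p_full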

Why predicted polynomial changes drastically when only the resolution of prediction grid changes?

 ̄綄美尐妖づ submitted on 2019-12-01 22:24:45
Why do I get different predictions from the exact same model when I only change the resolution of the prediction grid (by 0.001 vs. by 0.01)? set.seed(0) n_data=2000 x=runif(n_data)-0.5 y=0.1*sin(x*30)/x+runif(n_data) plot(x,y) poly_df=5 x_exp=as.data.frame(cbind(y,poly(x, poly_df))) fit=lm(y~.,data=x_exp) x_plt1=seq(-1,1,0.001) x_plt_exp1=as.data.frame(poly(x_plt1,poly_df)) lines(x_plt1,predict(fit,x_plt_exp1),lwd=3,col=2) x_plt2=seq(-1,1,0.01) x_plt_exp2=as.data.frame(poly(x_plt2,poly_df)) lines(x_plt2,predict(fit,x_plt_exp2),lwd=3,col=3) 李哲源: This is a coding / programming problem, as on my quick run I …
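A complementary sketch (assuming the same objects as in the earlier excerpt): with raw = TRUE the polynomial columns are plain powers of x, so they do not depend on which grid they are evaluated on, and coarse and fine grids give identical curves.

    # Raw (non-orthogonal) polynomial: columns are x, x^2, ..., x^5 and are
    # grid-independent, at the cost of a less well-conditioned design matrix.
    x_exp_raw <- as.data.frame(cbind(y, poly(x, poly_df, raw = TRUE)))
    fit_raw   <- lm(y ~ ., data = x_exp_raw)

    grid     <- seq(-0.5, 0.5, 0.01)
    grid_exp <- as.data.frame(poly(grid, poly_df, raw = TRUE))
    lines(grid, predict(fit_raw, grid_exp), lwd = 3, col = 4)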

Robust SEs (vcovHC) to be shown with texreg in R

主宰稳场 submitted on 2019-12-01 22:15:23
I am doing some regressions with the plm package and, when needed, I also obtain heteroskedasticity-consistent coefficients. Below are the commands that I run: library(plm) data("Produc", package = "plm") zz <- plm(log(gsp) ~ log(pcap) + log(pc) + log(emp) + unemp, data = Produc, index = c("state","year")) summary(zz) coeftest(zz, vcovHC) My problem starts here. Below is the command for obtaining LaTeX output with the help of texreg. How can I integrate the result obtained with the coeftest command into the LaTeX output? latex_reg <- texreg(list(coeftest_result), scriptsize=TRUE)
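A hedged sketch of one common route: texreg() accepts override.se and override.pvalues arguments, so the robust standard errors and p-values from coeftest() can be attached to the original plm model rather than passing the coeftest result itself. coeftest() lives in the lmtest package; column positions below assume its usual estimate / SE / statistic / p-value layout.

    library(lmtest)
    library(texreg)

    rob <- coeftest(zz, vcovHC)   # matrix: estimate, robust SE, t value, p value

    latex_reg <- texreg(list(zz),
                        override.se      = list(rob[, 2]),
                        override.pvalues = list(rob[, 4]),
                        scriptsize = TRUE)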

How to add a linear regression line to a double-logarithmic R plot?

我只是一个虾纸丫 submitted on 2019-12-01 21:55:13
I have the following data: someFactor = 500 x = c(1:250) y = x^-.25 * someFactor which I show in a double logarithmic plot: plot(x, y, log="xy") Now I "find out" the slope of the data using a linear model: model = lm(log(y) ~ log(x)) model which gives: Call: lm(formula = log(y) ~ log(x)) Coefficients: (Intercept) log(x) 6.215 -0.250 Now I'd like to plot the linear regression as a red line, but abline does not work: abline(model, col="red") What is the easiest way to add a regression line to my plot? lines(log(x), exp(predict(model, newdata=list(x=log(x)))), col="red") The range of values for x …
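A minimal sketch of an alternative, using only the objects defined in the question: because plot(x, y, log="xy") still expects coordinates on the original scale, the fitted values of the log-log model can be exponentiated and drawn with lines().

    plot(x, y, log = "xy")
    model <- lm(log(y) ~ log(x))

    # fitted(model) are predictions of log(y) at the observed x,
    # so exp() puts them back on the original scale for the log-log axes.
    lines(x, exp(fitted(model)), col = "red")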