linear-regression

tf.metrics.accuracy not working as intended

拜拜、爱过 · Submitted 2019-12-02 07:11:18
Question: I have a linear regression model that seems to be working fine, but I want to display the accuracy of the model. First, I initialize the variables and placeholders:

```python
X_train, X_test, Y_train, Y_test = train_test_split(
    X_data, Y_data, test_size=0.2
)
n_rows = X_train.shape[0]

X = tf.placeholder(tf.float32, [None, 89])
Y = tf.placeholder(tf.float32, [None, 1])

W_shape = tf.TensorShape([89, 1])
b_shape = tf.TensorShape([1])

W = tf.Variable(tf.random_normal(W_shape))
b = tf.Variable(tf.random_normal(b_shape))

pred = tf.add(tf.matmul(X, W), b)
cost = tf.reduce_sum(tf.pow(pred - Y, 2) / (2 * n_rows - 1))
```

Get p-value for group mean difference without refitting linear model with a new reference level

孤街浪徒 · Submitted 2019-12-02 05:30:34
When we have a linear model with a factor variable X (with levels A, B, and C):

```r
y ~ factor(X) + Var2 + Var3
```

the result shows the estimates XB and XC, which are the differences B - A and C - A (supposing that the reference level is A). If we want the p-value of the difference between B and C (C - B), we have to designate B or C as the reference group and re-run the model. Can we get the p-values of the effects B - A, C - A, and C - B all at once?

[Answer by 李哲源] You are looking for a linear hypothesis test, i.e., checking the p-value of a linear combination of regression coefficients. Based on my answer: How to conduct …
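The C - B comparison can indeed be read off a single fit. A minimal base-R sketch on synthetic data (the names `df`, `fit`, and the contrast vector `k` are illustrative, not from the question): a Wald t-test on the contrast (C - A) - (B - A) reproduces exactly the p-value you would get by releveling.

```r
set.seed(1)
df <- data.frame(X = factor(rep(c("A", "B", "C"), each = 20)),
                 y = rnorm(60))
fit <- lm(y ~ X, data = df)

b <- coef(fit)   # (Intercept), XB, XC
V <- vcov(fit)   # coefficient covariance matrix

# Contrast (C - A) - (B - A) = C - B over (Intercept, XB, XC)
k <- c(0, -1, 1)
est   <- sum(k * b)
se    <- sqrt(as.numeric(t(k) %*% V %*% k))
t_val <- est / se
p_CB  <- 2 * pt(abs(t_val), df = fit$df.residual, lower.tail = FALSE)
```

Packages such as `multcomp` (`glht`) or `car` (`linearHypothesis`) wrap the same computation, so all three contrasts can be tested from one model.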

Coefficient table does not have NA rows in rank-deficient fit; how to insert them?

吃可爱长大的小学妹 · Submitted 2019-12-02 05:16:53
Question:

```r
library(lmPerm)
x <- lmp(formula = a ~ b * c + d + e, data = df, perm = "Prob")
summary(x)  # truncated output; I can see `NA` rows here!
# Coefficients: (1 not defined because of singularities)
#     Estimate Iter Pr(Prob)
# b      5.874   51    1.000
# c    -30.060  281    0.263
# b:c       NA   NA       NA
# d1   -31.333   60    0.633
# d2    33.297  165    0.382
# d3   -19.096   51    1.000
# e      1.976   NA    NA
```

I want to pull out the Pr(Prob) results for everything, but:

```r
y <- summary(x)$coef[, "Pr(Prob)"]
# (Intercept)           b           c          d1          d2
#  0.09459459  1.00000000  0
```
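If `lmPerm` itself is not at hand, the same situation can be reproduced with a plain rank-deficient `lm`: `summary()` silently drops the aliased rows, while `coef()` keeps them as `NA`, so the full table can be rebuilt by matching row names. A minimal sketch (the collinear column `b` is artificial):

```r
set.seed(1)
d <- data.frame(y = rnorm(10), a = rnorm(10))
d$b <- d$a  # perfectly collinear -> rank-deficient fit

fit <- lm(y ~ a + b, data = d)
cf  <- summary(fit)$coefficients  # the NA row for `b` is dropped here

# Rebuild a full table with the NA rows re-inserted, matched by name
full <- matrix(NA_real_, nrow = length(coef(fit)), ncol = ncol(cf),
               dimnames = list(names(coef(fit)), colnames(cf)))
full[rownames(cf), ] <- cf
```

The same name-matching trick applies to the `lmPerm` summary, since its coefficient table also carries row names.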

Linear model singular because of large integer datetime in R?

核能气质少年 · Submitted 2019-12-02 04:22:38
A simple regression of random normal data on a date fails, but identical data with small integers instead of dates works as expected.

```r
# Example dataset with 100 observations at 2-second intervals.
set.seed(1)
df <- data.frame(x = as.POSIXct("2017-03-14 09:00:00") + seq(0, 199, 2),
                 y = rnorm(100))
head(df)
#                     x          y
# 1 2017-03-14 09:00:00 -0.6264538
# 2 2017-03-14 09:00:02  0.1836433
# 3 2017-03-14 09:00:04 -0.8356286

# Simple regression model.
m <- lm(y ~ x, data = df)
```

The slope is missing due to singularities in the data. Calling the summary demonstrates this:

```r
summary(m)
# Coefficients: (1 not defined
```
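A common workaround, sketched below on the same synthetic data: the timestamps are roughly 1.5e9 seconds since 1970, so centering them before fitting restores the numerical precision the least-squares decomposition needs (the variable name `x_sec` is illustrative).

```r
set.seed(1)
df <- data.frame(x = as.POSIXct("2017-03-14 09:00:00") + seq(0, 199, 2),
                 y = rnorm(100))

# Seconds since the first observation instead of seconds since 1970
df$x_sec <- as.numeric(df$x) - as.numeric(df$x[1])
m2 <- lm(y ~ x_sec, data = df)
```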

Incorrect abline line for a regression model with intercept in R

无人久伴 · Submitted 2019-12-02 03:16:44
(Reproducible example given.) In the following, I get an abline whose y-intercept is about 30, but the regression says the y-intercept should be 37.2851. Where am I wrong?

```r
mtcars$mpg  # 21.0 21.0 22.8 ... 21.4 (32 obs)
mtcars$wt   # 2.620 2.875 2.320 ... 2.780 (32 obs)

regression1 <- lm(mtcars$mpg ~ mtcars$wt)
coef(regression1)  # mpg ~ 37.2851 - 5.3445 * wt

plot(mtcars$mpg ~ mtcars$wt, pch = 19, col = 'gray50')  # pch: shape of points
abline(h = mean(mtcars$mpg), lwd = 2, col = 'darkorange')  # y-coordinate of horizontal line: 20.09062
abline(lm(mtcars$mpg ~ mtcars$wt), lwd = 2, col = 'sienna')
```

I looked at all the …
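One likely explanation, sketched below: the abline is correct, but the default plot window starts near min(mtcars$wt) ≈ 1.5, so the visible left end of the line sits around 37.29 - 5.34 * 1.5 ≈ 29, not at the x = 0 intercept. Extending `xlim` to include 0 puts the true 37.2851 intercept on screen:

```r
regression1 <- lm(mpg ~ wt, data = mtcars)

# Force the x-axis to include wt = 0 so the intercept is visible
plot(mpg ~ wt, data = mtcars, pch = 19, col = "gray50",
     xlim = c(0, max(mtcars$wt)), ylim = c(0, 40))
abline(regression1, lwd = 2, col = "sienna")
```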

Predict y value for a given x in R

懵懂的女人 · Submitted 2019-12-02 01:37:33
I have a linear model:

```r
mod <- lm(weight ~ age, data = f2)
```

I would like to input an age value and have the corresponding weight from this model returned. This is probably simple, but I have not found a simple way to do it.

[Answer] If your purposes are related to just one prediction, you can grab your coefficients with `coef(mod)`, or build a simple equation like this (with your age value in place of `your_value`):

```r
coef(mod)[1] + your_value * coef(mod)[2]
```

It is usually more robust to use the `predict` method of `lm`:

```r
f2 <- data.frame(age = c(10, 20, 30), weight = c(100, 200, 300))
f3 <- data.frame(age = c(15, 25))
mod <- lm(weight ~ age, data = f2)
pred3 <- predict(mod, newdata = f3)
```

R regression analysis: analyzing data for a certain ethnicity

亡梦爱人 · Submitted 2019-12-02 01:25:58
Question: I have a data set that investigates depression among individuals of different ethnicities (Black, White, and Latina). I want to know how depression at baseline relates to depression at post for all ethnic groups, so I did:

```r
lm(depression_base ~ depression_post, data = Data)
```

Now I want to look at the relationship by ethnicity. Ethnicity in my dataset is coded as 0 = White, 1 = Black, and 2 = Latina. I am thinking that I need to use the `ifelse` function, but I cannot seem to get it to work. Here …
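Rather than `ifelse`, one common option is to recode ethnicity as a factor and fit a single interaction model, which gives each group its own slope and intercept. The data below are synthetic and the variable names are borrowed from the question for illustration only:

```r
set.seed(1)
Data <- data.frame(
  depression_post = rnorm(90),
  ethnicity = factor(rep(0:2, each = 30), levels = 0:2,
                     labels = c("White", "Black", "Latina")))
Data$depression_base <- 0.5 * Data$depression_post + rnorm(90, sd = 0.3)

# Separate slope and intercept per ethnic group, in one fit
fit_eth <- lm(depression_base ~ depression_post * ethnicity, data = Data)
```

Alternatively, `lapply(split(Data, Data$ethnicity), function(d) lm(depression_base ~ depression_post, data = d))` fits a fully separate model per group.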

R: build separate models for each category

北战南征 · Submitted 2019-12-02 00:53:22
Short version: how do I build separate models for each category (without splitting the data)? (I am new to R.)

Long version: consider the following synthetic data:

```
housetype,ht1,ht2,age,price
O,0,1,1,1000
O,0,1,2,2000
O,0,1,3,3000
N,1,0,1,10000
N,1,0,2,20000
N,1,0,3,30000
```

We could model the above using two separate models:

```r
if (housetype == 'O') price = 1000 * age else price = 10000 * age
```

i.e., a separate model per category type. This is what I have tried:

```r
model <- lm(price ~ housetype + age, data = datavar)
```

and

```r
model <- lm(price ~ ht1 + ht2 + age, data = datavar)
```

Both of the above models (which are essentially the same) …
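A sketch of the usual single-model answer, using the synthetic data above: a nested (per-group slope) formula such as `housetype / age - 1` estimates one intercept and one `age` slope per `housetype`, which for this data is equivalent to fitting the two models separately:

```r
datavar <- read.csv(text = "housetype,ht1,ht2,age,price
O,0,1,1,1000
O,0,1,2,2000
O,0,1,3,3000
N,1,0,1,10000
N,1,0,2,20000
N,1,0,3,30000")

# One intercept and one age slope per housetype, from a single fit
model <- lm(price ~ housetype / age - 1, data = datavar)
coef(model)
```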

Why predicted polynomial changes drastically when only the resolution of prediction grid changes?

ε祈祈猫儿з · Submitted 2019-12-02 00:28:06
Question: (This question was migrated from Cross Validated because it can be answered on Stack Overflow.) Why do I get different predictions from the exact same model when I run predictions on different grid sizes (by 0.001 vs. by 0.01)?

```r
set.seed(0)
n_data <- 2000
x <- runif(n_data) - 0.5
y <- 0.1 * sin(x * 30) / x + runif(n_data)
plot(x, y)

poly_df <- 5
x_exp <- as.data.frame(cbind(y, poly(x, poly_df)))
fit <- lm(y ~ ., data = x_exp)

x_plt1 <- seq(-1, 1, 0.001)
x_plt_exp1 <- as.data.frame(poly(x_plt1, poly_df))
lines(x_plt1
```
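The usual culprit here is that `poly()` builds a data-dependent orthogonal basis: calling `poly(x_plt1, poly_df)` recomputes that basis on the prediction grid, so the design matrix changes with grid resolution. Keeping `poly()` inside the model formula lets `predict()` reuse the training basis (via `makepredictcall`), making predictions grid-independent. A sketch on the same synthetic data (names like `fit_poly` are illustrative):

```r
set.seed(0)
x <- runif(2000) - 0.5
y <- 0.1 * sin(x * 30) / x + runif(2000)

# poly() inside the formula, so predict() can reuse the training basis
fit_poly <- lm(y ~ poly(x, 5))

p_coarse <- predict(fit_poly, newdata = data.frame(x = seq(-0.5, 0.5, by = 0.01)))
p_fine   <- predict(fit_poly, newdata = data.frame(x = seq(-0.5, 0.5, by = 0.001)))
```

With this setup the coarse and fine grids agree wherever they share an x value, which is exactly what the `lines()`-over-`poly(x_plt1, ...)` approach fails to guarantee.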