linear-regression

Machine learning - Linear regression using batch gradient descent

冷暖自知 submitted on 2019-11-27 11:37:11

I am trying to implement batch gradient descent on a data set with a single feature and multiple training examples (m). When I use the normal equation I get the right answer, but the MATLAB code below, which performs batch gradient descent, gives the wrong one.

function [theta] = gradientDescent(X, y, theta, alpha, iterations)
    m = length(y);
    delta = zeros(2,1);
    for iter = 1:1:iterations
        for i = 1:1:m
            delta(1,1) = delta(1,1) + (X(i,:)*theta - y(i,1));
            delta(2,1) = delta(2,1) + ((X(i,:)*theta - y(i,1)) * X(i,2));
        end
        theta = theta - (delta * (alpha/m));
        computeCost(X, y, theta)
    end
end

y is the
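A likely culprit, for reference, is that delta is initialized once outside the iteration loop, so the gradient keeps accumulating across iterations instead of being recomputed on each pass. Below is a minimal R sketch of batch gradient descent for one feature plus an intercept (illustrative only; it assumes X has a leading column of ones, as in the MATLAB code, and is not the original poster's solution):

gradient_descent <- function(X, y, theta, alpha, iterations) {
  m <- length(y)
  for (iter in seq_len(iterations)) {
    delta <- c(0, 0)                        # reset the gradient accumulator every iteration
    for (i in seq_len(m)) {
      err <- sum(X[i, ] * theta) - y[i]     # prediction error for example i
      delta[1] <- delta[1] + err            # gradient w.r.t. the intercept
      delta[2] <- delta[2] + err * X[i, 2]  # gradient w.r.t. the slope
    }
    theta <- theta - (alpha / m) * delta    # simultaneous update of both parameters
  }
  theta
}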

Linear Regression with a known fixed intercept in R

孤者浪人 submitted on 2019-11-27 10:57:09

I want to calculate a linear regression using the lm() function in R. Additionally, I want to get the slope of a regression where I explicitly give the intercept to lm(). I found an example on the internet and tried to read the R help ("?lm"), but unfortunately I was not able to understand it, and I did not succeed. Can anyone tell me where my mistake is?

lin <- data.frame(x = c(0:6), y = c(0.3, 0.1, 0.9, 3.1, 5, 4.9, 6.2))
plot(lin$x, lin$y)
regImp = lm(formula = lin$x ~ lin$y)
abline(regImp, col = "blue")

# Does not work:
# Use 1 as intercept
explicitIntercept = rep(1, length(lin$x))
regExp = lm
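One standard way to fit only a slope while holding the intercept at a known value (a sketch under the assumption that the intercept should be 1, as in the question; not necessarily the answer from the original thread) is to subtract the known intercept from y, or pass it as an offset, and drop the estimated intercept from the formula:

lin <- data.frame(x = 0:6, y = c(0.3, 0.1, 0.9, 3.1, 5, 4.9, 6.2))
known_intercept <- 1

# Option 1: shift y by the known intercept and suppress the estimated intercept
reg_fixed <- lm(I(y - known_intercept) ~ 0 + x, data = lin)
coef(reg_fixed)   # the single coefficient is the slope

# Option 2: equivalent fit using offset()
reg_offset <- lm(y ~ 0 + x + offset(rep(known_intercept, nrow(lin))), data = lin)

plot(lin$x, lin$y)
abline(a = known_intercept, b = coef(reg_fixed)[["x"]], col = "blue")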

Messy plot when plotting predictions of a polynomial regression using lm() in R

自古美人都是妖i submitted on 2019-11-27 09:52:53

I am building a quadratic model with lm in R:

y <- data[[1]]
x <- data[[2]]
x2 <- x^2
quadratic.model = lm(y ~ x + x2)

Now I want to display both the predicted values and the actual values on a plot. I tried this:

par(las = 1, bty = "l")
plot(y ~ x)
P <- predict(quadratic.model)
lines(x, P)

but the line comes up all squiggly. Maybe it has to do with the fact that it's quadratic? Thanks for any help.

李哲源 answered: You need order():

P <- predict(quadratic.model)
plot(y ~ x)
reorder <- order(x)
lines(x[reorder], P[reorder])

My answer here is related: Problems displaying LOESS regression line and confidence
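A self-contained sketch of the same idea on simulated data (names and values here are illustrative): lines() connects points in the order they are supplied, so the fitted values must be drawn with x sorted in increasing order.

set.seed(1)
x <- runif(50, -2, 2)
y <- 1 + 2 * x + 3 * x^2 + rnorm(50)
fit <- lm(y ~ x + I(x^2))
plot(y ~ x)
ord <- order(x)
lines(x[ord], fitted(fit)[ord], col = "red")   # smooth curve once x is ordered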

Are there any Linear Regression Functions in SQL Server?

梦想的初衷 submitted on 2019-11-27 09:31:31

Question: Are there any linear regression functions in SQL Server 2005/2008, similar to the linear regression functions in Oracle?

Answer 1: To the best of my knowledge, there is none. Writing one is pretty straightforward, though. The following gives you the constant alpha and slope beta for y = Alpha + Beta * x + epsilon:

-- test data (GroupIDs 1, 2 = normal regressions, 3, 4 = no variance)
WITH some_table(GroupID, x, y) AS (
    SELECT 1, 1, 1 UNION
    SELECT 1, 2, 2 UNION
    SELECT 1, 3, 1.3 UNION
    SELECT 1, 4,
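Since the simple-regression coefficients are just functions of basic aggregates (counts and sums of x, y, x*y and x^2), they are easy to express with SQL's grouping functions. A small R sketch of the same closed-form formulas, useful for cross-checking whatever SQL you write (the data values are illustrative):

x <- c(1, 2, 3, 4)
y <- c(1, 2, 1.3, 3.75)
n <- length(x)
beta  <- (sum(x * y) - n * mean(x) * mean(y)) / (sum(x^2) - n * mean(x)^2)
alpha <- mean(y) - beta * mean(x)
c(alpha = alpha, beta = beta)   # should agree with coef(lm(y ~ x))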

OLS Regression: Scikit vs. Statsmodels?

只谈情不闲聊 submitted on 2019-11-27 09:30:06

Question: Short version: I was using scikit-learn's LinearRegression on some data, but I'm used to p-values, so I put the data into a statsmodels OLS as well. Although the R^2 is about the same, the variable coefficients are all different by large amounts. This concerns me, since the most likely explanation is that I've made an error somewhere, and now I don't feel confident in either output (since I have probably built one of the models incorrectly, but don't know which one). Longer version: Because I don't know where the

Error in Confusion Matrix : the data and reference factors must have the same number of levels

雨燕双飞 submitted on 2019-11-27 09:04:24

I've trained a Linear Regression model with R caret. I'm now trying to generate a confusion matrix and keep getting the following error:

Error in confusionMatrix.default(pred, testing$Final) : the data and reference factors must have the same number of levels

EnglishMarks <- read.csv("E:/Subject Wise Data/EnglishMarks.csv", header = TRUE)
inTrain <- createDataPartition(y = EnglishMarks$Final, p = 0.7, list = FALSE)
training <- EnglishMarks[inTrain,]
testing <- EnglishMarks[-inTrain,]
predictionsTree <- predict(treeFit, testdata)
confusionMatrix(predictionsTree, testdata$catgeory)
modFit <- train(Final~UT1+UT2
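For reference, caret::confusionMatrix() compares two factors and requires them to share the same set of levels; numeric predictions from a regression model therefore have to be converted (or binned) into the same categories as the reference first. A hedged sketch of the usual pattern (object names follow the question; this is not necessarily the accepted fix):

lvl    <- levels(factor(testing$Final))
pred   <- predict(modFit, newdata = testing)
pred_f <- factor(pred, levels = lvl)            # predictions coerced to the reference levels
ref_f  <- factor(testing$Final, levels = lvl)
caret::confusionMatrix(pred_f, ref_f)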

How does `poly()` generate orthogonal polynomials? How to understand the “coefs” returned?

丶灬走出姿态 submitted on 2019-11-27 08:28:50

My understanding of orthogonal polynomials is that they take the form

y(x) = a1 + a2(x - c1) + a3(x - c2)(x - c3) + a4(x - c4)(x - c5)(x - c6) + ...

up to the number of terms desired, where a1, a2, etc. are coefficients of each orthogonal term (varying between fits), and c1, c2, etc. are coefficients within the orthogonal terms, determined such that the terms maintain orthogonality (consistent between fits using the same x values). I understand poly() is used to fit orthogonal polynomials. An example:

x = c(1.160, 1.143, 1.126, 1.109, 1.079, 1.053, 1.040, 1.027, 1.015, 1.004, 0.994, 0.985, 0.977) #
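Two properties can be checked directly in R (a small illustrative sketch, not the original answer): the columns produced by poly() are orthonormal, and the "coefs" attribute stores the centering and scaling constants that let predict() rebuild exactly the same basis for new x values.

x <- c(1.160, 1.143, 1.126, 1.109, 1.079, 1.053, 1.040, 1.027, 1.015,
       1.004, 0.994, 0.985, 0.977)
z <- poly(x, 3)
round(crossprod(z), 10)              # identity matrix: the columns are orthonormal
round(colSums(z), 10)                # zeros: each column is also orthogonal to the constant term
attr(z, "coefs")                     # the alpha and norm2 values used in the recursion
predict(z, newdata = c(1.00, 1.10))  # same basis evaluated at new points, via the stored coefs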

R - Calculate Test MSE given a trained model from a training set and a test set

谁说我不能喝 submitted on 2019-11-27 07:55:23

Question: Given two simple sets of data:

head(training_set)
  x         y
1 1  2.167512
2 2  4.684017
3 3  3.702477
4 4  9.417312
5 5  9.424831
6 6 13.090983

head(test_set)
  x        y
1 1 2.068663
2 2 4.162103
3 3 5.080583
4 4 8.366680
5 5 8.344651

I want to fit a linear regression line on the training data, and use that line (or the coefficients) to calculate the "test MSE" or Mean Squared Error of the Residuals on the test data once that line is fit there.

model = lm(y~x,data=training_set)
train_MSE = mean(model$residuals
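A minimal sketch of one standard way to compute both quantities (object names follow the question; this is not necessarily the original answer):

model     <- lm(y ~ x, data = training_set)
train_MSE <- mean(residuals(model)^2)              # MSE on the training data
pred      <- predict(model, newdata = test_set)    # apply the trained line to the test x values
test_MSE  <- mean((test_set$y - pred)^2)           # MSE on the held-out test data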

Adding a regression line on a ggplot

浪尽此生 submitted on 2019-11-27 06:51:19

I'm trying hard to add a regression line to a ggplot. I first tried with abline, but I didn't manage to make it work. Then I tried this:

data = data.frame(x.plot = rep(seq(1, 5), 10), y.plot = rnorm(50))
ggplot(data, aes(x.plot, y.plot)) + stat_summary(fun.data = mean_cl_normal) + geom_smooth(method = 'lm', formula = data$y.plot ~ data$x.plot)

But it is not working either.

In general, to provide your own formula you should use the arguments x and y, which will correspond to the values you provided in ggplot() - in this case x will be interpreted as x.plot and y as y.plot. More information about smoothing methods and
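Following that explanation, a corrected call might look like the sketch below (the stat_summary layer from the question is omitted to keep the example self-contained; this is illustrative rather than the thread's exact answer):

library(ggplot2)
data <- data.frame(x.plot = rep(seq(1, 5), 10), y.plot = rnorm(50))
ggplot(data, aes(x.plot, y.plot)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x)   # x and y refer to the mapped aesthetics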

linear model with `lm`: how to get prediction variance of sum of predicted values

有些话、适合烂在心里 submitted on 2019-11-27 06:20:39

Question: I'm summing the predicted values from a linear model with multiple predictors, as in the example below, and want to calculate the combined variance, standard error and possibly confidence intervals for this sum.

lm.tree <- lm(Volume ~ poly(Girth, 2), data = trees)

Suppose I have a set of Girths:

newdat <- list(Girth = c(10, 12, 14, 16))

for which I want to predict the total Volume:

pr <- predict(lm.tree, newdat, se.fit = TRUE)
total <- sum(pr$fit)
# [1] 111.512

How can I obtain the variance for
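One common approach (a sketch, not necessarily the answer given in the original thread) is to build the design matrix for the new points from the fitted model's terms, form the covariance matrix of the predicted means, and take the variance of their sum as the sum of all entries of that matrix:

lm.tree <- lm(Volume ~ poly(Girth, 2), data = trees)
newdat  <- data.frame(Girth = c(10, 12, 14, 16))

# design matrix at the new points, using the basis stored in the fitted terms
Xp <- model.matrix(delete.response(terms(lm.tree)), newdat)

V_fit   <- Xp %*% vcov(lm.tree) %*% t(Xp)  # covariance matrix of the predicted means
var_sum <- sum(V_fit)                      # Var(sum) = 1' V 1
se_sum  <- sqrt(var_sum)                   # standard error of the summed prediction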