regression | 易学教程

How to calculate Total least squares in R? (Orthogonal regression) [closed]

I couldn't find a function to calculate orthogonal regression (TLS, Total Least Squares). Is there a package with this kind of function? Update: I mean calculating the distance of each point symmetrically, not asymmetrically as lm() does.

You might want to consider the Deming() function in package MethComp [function info]. The package also contains a detailed derivation of the theory behind Deming regression. The following searches of the R Archives also provide plenty of options: Total Least Squares, Deming regression.

Your multiple questions on CrossValidated, here and R-Help imply that
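For the two-variable case asked about here, the symmetric (perpendicular-distance) fit has a closed form, which is Deming regression with equal error variances in x and y. The following is a minimal sketch in Python rather than R, to keep it dependency-free; `orthogonal_fit` and the sample data are hypothetical names and values, not part of MethComp.

```python
import math

def orthogonal_fit(xs, ys):
    """Fit y = a + b*x by minimizing perpendicular distances to the line."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    if sxy == 0:
        raise ValueError("slope formula is undefined when the sample covariance is zero")
    # Closed-form slope of the first principal axis (Deming regression
    # with equal error variances in x and y).
    b = (syy - sxx + math.sqrt((syy - sxx) ** 2 + 4 * sxy ** 2)) / (2 * sxy)
    a = my - b * mx
    return a, b

# Hypothetical sample data, for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.9, 5.2, 6.8, 9.1, 11.0]
a, b = orthogonal_fit(xs, ys)

# Symmetry check: swapping the roles of x and y inverts the slope,
# which an ordinary least squares fit does not do in general.
a_swapped, b_swapped = orthogonal_fit(ys, xs)
```

The swapped fit describes the same geometric line, which is the symmetric behaviour the question contrasts with lm().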

R: lm() result differs when using `weights` argument and when using manually reweighted data

In order to correct heteroskedasticity in error terms, I am running the following weighted least squares regression in R:

#Call:
#lm(formula = a ~ q + q2 + b + c, data = mydata, weights = weighting)
#Weighted Residuals:
#     Min       1Q   Median       3Q      Max
#-1.83779 -0.33226  0.02011  0.25135  1.48516
#Coefficients:
#             Estimate Std. Error t value Pr(>|t|)
#(Intercept) -3.939440   0.609991  -6.458 1.62e-09 ***
#q            0.175019   0.070101   2.497 0.013696 *
#q2           0.048790   0.005613   8.693 8.49e-15 ***
#b            0.473891   0.134918   3.512 0.000598 ***
#c            0.119551   0.125430   0.953 0.342167
#---
#Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0
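One thing worth checking in a comparison like the one in the title (the excerpt is cut off, so this is an assumption about what may be relevant, not the accepted answer): weighted least squares minimizes sum(w * residual^2), which matches ordinary least squares on manually reweighted data only if every column, including the constant, is multiplied by sqrt(w) and no further intercept is added. The sketch below uses hypothetical data and a one-predictor model, written in Python for self-containment.

```python
import math

def solve_2x2(a11, a12, a21, a22, b1, b2):
    """Solve a 2x2 linear system with Cramer's rule."""
    det = a11 * a22 - a12 * a21
    return (b1 * a22 - a12 * b2) / det, (a11 * b2 - a21 * b1) / det

def wls(xs, ys, ws):
    """Weighted least squares for y = b0 + b1*x: minimize sum(w * resid^2)."""
    sw = sum(ws)
    swx = sum(w * x for w, x in zip(ws, xs))
    swxx = sum(w * x * x for w, x in zip(ws, xs))
    swy = sum(w * y for w, y in zip(ws, ys))
    swxy = sum(w * x * y for w, x, y in zip(ws, xs, ys))
    return solve_2x2(sw, swx, swx, swxx, swy, swxy)

def ols_no_intercept(c1, c2, ys):
    """Ordinary least squares of ys on two columns, with no added intercept."""
    s11 = sum(u * u for u in c1)
    s12 = sum(u * v for u, v in zip(c1, c2))
    s22 = sum(v * v for v in c2)
    t1 = sum(u * y for u, y in zip(c1, ys))
    t2 = sum(v * y for v, y in zip(c2, ys))
    return solve_2x2(s11, s12, s12, s22, t1, t2)

# Hypothetical data and weights, for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [1.2, 1.9, 3.4, 3.8, 5.3, 5.9]
ws = [0.5, 1.0, 2.0, 1.5, 0.8, 1.2]

direct = wls(xs, ys, ws)

# Manual reweighting: scale the response, the predictor AND the constant
# column by sqrt(w), then fit without an extra intercept.
root = [math.sqrt(w) for w in ws]
manual = ols_no_intercept(root,
                          [r * x for r, x in zip(root, xs)],
                          [r * y for r, y in zip(root, ys)])
```

Here `direct` and `manual` agree up to rounding. If the constant column is left unscaled, the manual fit is a different model and its coefficients will generally differ.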

Different Robust Standard Errors of Logit Regression in Stata and R

I am trying to replicate a logit regression from Stata in R. In Stata I use the option "robust" to get robust standard errors (heteroscedasticity-consistent standard errors). I am able to replicate exactly the same coefficients as Stata, but I am not able to get the same robust standard errors with the package "sandwich". I have tried some OLS linear regression examples; it seems that the sandwich estimators of R and Stata give me the same robust standard errors for OLS. Does anybody know how Stata calculates the sandwich estimator for non-linear regression, in my case the logit regression?

How to change points and add a regression to a cloudplot (using R)?

To make clear what I'm asking, I've created a simple example. Step one is to create some data:

gender <- factor(rep(c(1, 2), c(43, 41)), levels = c(1, 2), labels = c("male", "female"))
numberofdrugs <- rpois(84, 50) + 1
geneticvalue <- rpois(84, 75)
death <- rpois(42, 50) + 15
y <- data.frame(death, numberofdrugs, geneticvalue, gender)

So this is some random data merged into one data.frame. From this data I'd like to plot a cloud in which I can distinguish between males and females, and to which I add two simple regressions (one for females and one for males). So I've started, but I couldn't get to

How to put a complicated equation into a R formula?

Question: We have the diameter of trees as the predictor and tree height as the dependent variable. A number of different equations exist for this kind of data, and we are trying to model some of them and compare the results. However, we can't figure out how to correctly put one equation into the corresponding R formula format. The trees data set in R can be used as an example.

data(trees)
df <- trees
df$h <- df$Height * 0.3048 #transform to metric system
df$dbh <- (trees$Girth * 0.3048) / pi #transform

tensorflow deep neural network for regression always predict same results in one batch

I use TensorFlow to implement a simple multi-layer perceptron for regression. The code is modified from the standard MNIST classifier; I only changed the output cost to MSE (using tf.reduce_mean(tf.square(pred-y))) and some input and output size settings. However, when I train the network for regression, after several epochs the outputs within a batch are all identical. For example:

target: 48.129, estimated: 42.634
target: 46.590, estimated: 42.634
target: 34.209, estimated: 42.634
target: 69.677, estimated: 42.634
......

I have tried different batch size, different initialization, input

Plotting confidence and prediction intervals with repeated entries

Question: I have a correlation plot for two variables: the predictor variable (temperature) on the x-axis and the response variable (density) on the y-axis. My best-fit least squares regression line is a 2nd-order polynomial. I would also like to plot confidence and prediction intervals. The method described in this answer seems perfect. However, my dataset (n=2340) has repeated entries for many (x,y) pairs. My resulting plot looks like this:

Here is my relevant code (slightly modified from linked

Quadratic and cubic regression in Excel

I have the following information:

Height  Weight
170     65
167     55
189     85
175     70
166     55
174     55
169     69
170     58
184     84
161     56
170     75
182     68
167     51
187     85
178     62
173     60
172     68
178     55
175     65
176     70

I want to construct quadratic and cubic regression analysis in Excel. I know how to do it by linear regression in Excel, but what about quadratic and cubic? I have searched a lot of resources, but could not find anything helpful.

Ian Boyd: You need to use an undocumented trick with Excel's LINEST function:

=LINEST(known_y's, [known_x's], [const], [stats])

Background

A regular linear regression is calculated
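Whatever exact form the LINEST trick takes in the elided answer, the general idea behind quadratic and cubic regression is that the model is still linear in its coefficients: the powers x, x^2 (and x^3) simply act as extra predictor columns. The sketch below illustrates this in Python with the Height/Weight data from the question; `solve` and `fit_polynomial` are hypothetical helpers, and centering the predictor is a choice made here for numerical stability, not something taken from the answer.

```python
def solve(matrix, rhs):
    """Solve a small linear system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [r] for row, r in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    sol = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * sol[c] for c in range(r + 1, n))
        sol[r] = (aug[r][n] - tail) / aug[r][r]
    return sol

def fit_polynomial(xs, ys, degree):
    """Return a prediction function for the least-squares polynomial of the given degree."""
    center = sum(xs) / len(xs)
    k = degree + 1
    # Design matrix columns are powers of the centered predictor: 1, u, u^2, ...
    rows = [[(x - center) ** p for p in range(k)] for x in xs]
    # Normal equations (X'X) c = X'y.
    xtx = [[sum(row[i] * row[j] for row in rows) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * y for row, y in zip(rows, ys)) for i in range(k)]
    coeffs = solve(xtx, xty)
    return lambda x: sum(c * (x - center) ** p for p, c in enumerate(coeffs))

# Data from the question.
heights = [170, 167, 189, 175, 166, 174, 169, 170, 184, 161,
           170, 182, 167, 187, 178, 173, 172, 178, 175, 176]
weights = [65, 55, 85, 70, 55, 55, 69, 58, 84, 56,
           75, 68, 51, 85, 62, 60, 68, 55, 65, 70]

quadratic = fit_polynomial(heights, weights, 2)
cubic = fit_polynomial(heights, weights, 3)
```

Calling `quadratic(175)` or `cubic(175)` then gives the fitted weight for a height of 175 under each model.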

Is there a fast estimation of simple regression (a regression line with only intercept and slope)?

This question relates to a machine learning feature selection procedure. I have a large matrix of features, where columns are the features of the subjects (rows):

set.seed(1)
features.mat <- matrix(rnorm(10*100), ncol=100)
colnames(features.mat) <- paste("F", 1:100, sep="")
rownames(features.mat) <- paste("S", 1:10, sep="")

The response was measured for each subject (S) under different conditions (C) and therefore looks like this:

response.df <- data.frame(S = c(sapply(1:10, function(x) rep(paste("S", x, sep = ""), 100))), C = rep(paste("C", 1:100, sep = ""), 10), response = rnorm(1000),
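For a line with only an intercept and a slope, no general model-fitting machinery is needed: the slope is the sample covariance of x and y divided by the variance of x, and the intercept follows from the means. The sketch below shows that shortcut in Python; the function names and toy data are hypothetical, and it ignores the subject/condition structure of the question's data.

```python
def simple_regression(xs, ys):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope

def regress_on_each_feature(feature_columns, ys):
    """Fit one simple regression per feature column against the same response."""
    return {name: simple_regression(col, ys) for name, col in feature_columns.items()}

# Hypothetical toy data: two feature columns measured on four subjects.
features = {"F1": [0.0, 1.0, 2.0, 3.0], "F2": [1.0, 0.0, 1.0, 0.0]}
response = [1.0, 3.0, 5.0, 7.0]
fits = regress_on_each_feature(features, response)
```

Each entry of `fits` holds the (intercept, slope) pair for one feature, so screening many features reduces to a handful of sums per column.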

Model matrix with all pairwise interactions between columns

Question: Let's say that I have a numeric data matrix with columns w, x, y, z, and I also want to add the columns equivalent to w*x, w*y, w*z, x*y, x*z, y*z, since I want my covariate matrix to include all pairwise interactions. Is there a clean and effective way to do this?

Answer 1: If you mean in a model formula, then the ^ operator does this.

## dummy data
set.seed(1)
dat <- data.frame(Y = rnorm(10), x = rnorm(10), y = rnorm(10), z = rnorm(10))

The formula is

form <- Y ~ (x + y + z)^2

which
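To make concrete which columns "all pairwise interactions" refers to, the following sketch builds them by hand in Python: one product column for every unordered pair of the original columns. The helper name and the two-row toy data are hypothetical; the "a:b" labels merely imitate the way R names interaction terms.

```python
from itertools import combinations

def add_pairwise_interactions(columns):
    """Return a new dict with the original columns plus all pairwise products."""
    out = dict(columns)
    for a, b in combinations(columns, 2):
        # Element-wise product of the two columns.
        out[f"{a}:{b}"] = [u * v for u, v in zip(columns[a], columns[b])]
    return out

# Hypothetical toy matrix with columns w, x, y, z and two rows.
data = {"w": [1.0, 2.0], "x": [3.0, 4.0], "y": [5.0, 6.0], "z": [7.0, 8.0]}
expanded = add_pairwise_interactions(data)
```

With four input columns this yields the four originals plus six products, matching the w*x, w*y, w*z, x*y, x*z, y*z list in the question.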