linear-regression

Why do I get NA coefficients, and how does `lm` drop reference levels for interactions?

Submitted by 感情迁移 on 2019-12-01 05:26:40
I am trying to understand how R determines reference groups for interactions in a linear model. Consider the following:

df <- structure(list(id = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 4L, 5L, 5L, 5L, 5L, 5L, 5L), .Label = c("1", "2", "3", "4", "5"), class = "factor"), year = structure(c(1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L), .Label = c("1", "2"), class = "factor"), treatment = structure(c(2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L,
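
The posted `df` is truncated, so here is a minimal hedged sketch of the usual cause with hypothetical data: when one factor level never co-occurs with a level of another factor, the corresponding interaction column of the design matrix is linearly dependent (here, all zero), and `lm` reports its coefficient as NA rather than picking a new reference group.

```r
# Hypothetical data (the posted df is truncated): treatment B is never
# observed in year 2, so the interaction column is all zero in the design.
set.seed(1)
toy <- data.frame(
  treatment = factor(rep(c("A", "B"), each = 4)),
  year      = factor(rep(c("1", "2"), times = 4)),
  y         = rnorm(8)
)
toy$year[toy$treatment == "B"] <- "1"

fit <- lm(y ~ treatment * year, data = toy)
coef(fit)    # treatmentB:year2 is NA: that cell of the design is empty
alias(fit)   # lists the linearly dependent (aliased) terms
```

R pivots the rank-deficient column out of the fit and leaves NA; `alias()` is the quickest way to see which term was dropped.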

Why does the number of rows change during AIC in R? How can I ensure that this doesn't happen?

Submitted by 那年仲夏 on 2019-12-01 04:59:42
I'm trying to find a minimal adequate model using AIC in R. I keep getting the following error:

Error in step(model) : number of rows in use has changed: remove missing values?

My data:

data <- structure(list(ID = c(1L, 2L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 13L, 14L, 15L, 16L, 17L, 18L, 19L, 20L, 21L, 22L, 23L, 24L, 25L, 27L, 28L, 29L, 30L, 31L, 33L, 34L, 35L, 37L, 38L, 39L, 40L, 41L, 42L, 43L, 44L, 45L, 46L, 47L, 48L, 49L, 50L, 51L, 52L, 53L, 54L, 55L, 56L, 57L, 58L, 59L, 60L, 61L, 62L, 63L, 64L, 65L, 66L, 67L, 68L, 69L, 70L, 71L, 72L, 73L, 74L, 75L, 76L, 77L, 78L, 79L, 80L, 81L, 82L, 83L,
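
A sketch of the usual fix, with hypothetical variable names since the posted data frame is cut off: `step()` must compare every candidate model on the same rows, and NAs in a predictor that only some models use change the row count mid-search. Restricting to complete cases once, up front, keeps n fixed.

```r
# Variable names x1..x3 are hypothetical; substitute the real candidates.
vars <- c("y", "x1", "x2", "x3")        # response plus all candidate predictors
data_cc <- na.omit(data[, vars])        # drop incomplete cases once, up front
full <- lm(y ~ x1 + x2 + x3, data = data_cc)
step(full)                              # AIC-based selection now sees a fixed n
```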

How does the subset argument work in the lm() function?

Submitted by 社会主义新天地 on 2019-12-01 04:18:57
I have been trying to figure out how the subset argument in R's lm() function works. In particular, the following code seems dubious to me:

data(mtcars)
summary(lm(mpg ~ wt, data=mtcars))
summary(lm(mpg ~ wt, cyl, data=mtcars))

In every case the regression has 32 observations:

dim(lm(mpg ~ wt, cyl, data=mtcars)$model)
[1] 32 2
dim(lm(mpg ~ wt, data=mtcars)$model)
[1] 32 2

yet the coefficients change (along with the R²). The help doesn't provide much information on this matter: "subset: an optional vector specifying a subset of observations to be used in the fitting process". As a general principle
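
What happens here, in sketch form: because `data` is matched by name, the unnamed `cyl` falls through to the next free formal, `subset`, and a numeric `subset` is treated as row indices, repeats included. So the model still has 32 rows, just heavily resampled ones.

```r
data(mtcars)
fit1 <- lm(mpg ~ wt, data = mtcars)

# Equivalent to lm(mpg ~ wt, cyl, data = mtcars): cyl's values (4, 6, 8, ...)
# are evaluated inside mtcars and used as ROW INDICES, with repetition.
fit2 <- lm(mpg ~ wt, subset = cyl, data = mtcars)
all.equal(coef(fit2), coef(lm(mpg ~ wt, data = mtcars[mtcars$cyl, ])))  # TRUE

# The intended use of subset is a logical condition:
fit4 <- lm(mpg ~ wt, data = mtcars, subset = cyl == 4)
nobs(fit4)   # 11 four-cylinder cars
```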

Plot conditional density curve `P(Y|X)` along a linear regression line

Submitted by 南楼画角 on 2019-12-01 03:53:34
This is my data frame, with two columns Y (response) and X (covariate):

## Editor edit: use `dat` not `data`
dat <- structure(list(Y = c(NA, -1.793, -0.642, 1.189, -0.823, -1.715, 1.623, 0.964, 0.395, -3.736, -0.47, 2.366, 0.634, -0.701, -1.692, 0.155, 2.502, -2.292, 1.967, -2.326, -1.476, 1.464, 1.45, -0.797, 1.27, 2.515, -0.765, 0.261, 0.423, 1.698, -2.734, 0.743, -2.39, 0.365, 2.981, -1.185, -0.57, 2.638, -1.046, 1.931, 4.583, -1.276, 1.075, 2.893, -1.602, 1.801, 2.405, -5.236, 2.214, 1.295, 1.438, -0.638, 0.716, 1.004, -1.328, -1.759, -1.315, 1.053, 1.958, -2.034, 2.936, -0.078, -0.676, -2
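
One way to sketch this in base R, assuming `dat` carries the full X and Y columns: fit the line, take the residual standard error as the sd of the conditional normal P(Y|X), and draw each density sideways at a few x positions along the fitted line.

```r
fit <- lm(Y ~ X, data = dat)          # NAs dropped by the default na.action
sigma <- summary(fit)$sigma           # residual standard error

plot(Y ~ X, data = dat, col = "grey")
abline(fit)

for (x0 in quantile(dat$X, c(0.2, 0.5, 0.8), na.rm = TRUE)) {
  yhat <- predict(fit, newdata = data.frame(X = x0))
  yy <- seq(yhat - 3 * sigma, yhat + 3 * sigma, length.out = 100)
  dd <- dnorm(yy, mean = yhat, sd = sigma)
  lines(x0 + dd / max(dd) * 2, yy)    # density drawn sideways; width 2 is arbitrary
}
```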

Linear regression using lm(): surprised by the result

Submitted by 半城伤御伤魂 on 2019-12-01 03:47:16
I used linear regression on my data, via the lm function. Everything works (no error message), but I'm somehow surprised by the result: I have the impression R "misses" a group of points, i.e. that the intercept and slope are not the best fit. For instance, I am referring to the group of points at coordinates x = 15-25, y = 0-20. My questions: is there a function to compare the "expected" coefficients with the "lm-calculated" coefficients? Have I made a silly mistake when coding that leads lm to do this? Following some answers, additional information on x and y: x and y are both visual
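
There is no built-in "compare with expected coefficients" function, but the check is short: `lm` minimizes the residual sum of squares, so any hand-picked line can at best tie it. A sketch using the questioner's `x` and `y`, with hypothetical "expected" values:

```r
fit <- lm(y ~ x)
rss_lm <- sum(residuals(fit)^2)

a_exp <- 0; b_exp <- 1               # hypothetical "expected" intercept/slope
rss_exp <- sum((y - (a_exp + b_exp * x))^2)
c(lm = rss_lm, expected = rss_exp)   # rss_lm <= rss_exp, always

plot(x, y)
abline(fit)                          # lm's line
abline(a_exp, b_exp, lty = 2)        # the "expected" line, dashed
```

If the expected line really has a lower RSS, something else is wrong (e.g. points silently dropped as NA); otherwise lm's line is the least-squares best fit by construction.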

Moving window regression

Submitted by 核能气质少年 on 2019-12-01 01:54:42
I want to perform a moving window regression on every pixel of two raster stacks representing Band3 and Band4 of Landsat data. The result should be two additional stacks, one holding the intercept and the other the slope of the regression: layer 1 of stacks "B3" and "B4" yields layer 1 of stacks "intercept" and "slope", layer 2 of B3 and B4 yields layer 2, and so on. I have already come across the gwr function, but I want to stay within the raster package. I know that focal must be involved in order to set up my moving window (which should be
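
Staying inside the raster package, one hedged sketch (window fixed at 3x3; stack names B3/B4 as in the question): the per-pixel OLS slope over a window is cov(x, y)/var(x), and all the needed moments can be built from `focal` sums, applied layer by layer.

```r
library(raster)

# Per-pixel OLS of y on x over a 3x3 window, assembled from focal sums:
# slope = (n*Sxy - Sx*Sy) / (n*Sxx - Sx^2), intercept = (Sy - slope*Sx) / n.
win_regress <- function(x, y, w = matrix(1, 3, 3)) {
  n   <- focal(x * 0 + 1, w, sum)   # cells per window (NAs propagate)
  sx  <- focal(x,     w, sum)
  sy  <- focal(y,     w, sum)
  sxx <- focal(x * x, w, sum)
  sxy <- focal(x * y, w, sum)
  slope     <- (n * sxy - sx * sy) / (n * sxx - sx * sx)  # NaN where var(x) = 0
  intercept <- (sy - slope * sx) / n
  stack(intercept, slope)
}

# focal() works on single layers, so loop over the stacks layer by layer.
res       <- lapply(1:nlayers(B3), function(i) win_regress(B3[[i]], B4[[i]]))
intercept <- stack(lapply(res, `[[`, 1))
slope     <- stack(lapply(res, `[[`, 2))
```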

How to extract a particular value from the OLS summary in pandas?

Submitted by 自闭症网瘾萝莉.ら on 2019-12-01 00:55:12
Question: is it possible to get other values (currently I only know how to get the beta and the intercept) from the summary of a linear regression in pandas? I need to get the R-squared. Here is an excerpt from the manual:

In [244]: model = ols(y=rets['AAPL'], x=rets.ix[:, ['GOOG']])
In [245]: model
Out[245]:
-------------------------Summary of Regression Analysis-------------------------
Formula: Y ~ <GOOG> + <intercept>
Number of Observations: 756
Number of Degrees of Freedom: 2
R-squared: 0.2814
Adj R-squared:
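
If I recall the old API correctly, the deprecated pandas OLS object shown above exposed R-squared as the `r2` attribute; that interface has long since been removed from pandas, so here is a sketch of the statsmodels equivalent, where every summary figure is an attribute of the fitted results object:

```python
import statsmodels.api as sm

# rets is the returns DataFrame from the question.
X = sm.add_constant(rets[["GOOG"]])
res = sm.OLS(rets["AAPL"], X).fit()

print(res.rsquared)        # R-squared
print(res.rsquared_adj)    # adjusted R-squared
print(res.params)          # intercept and beta
```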

Why do different methods for solving Xc = y in Python give different solutions when they should not?

Submitted by 怎甘沉沦 on 2019-12-01 00:52:14
I was trying to solve a square linear system Xc = y. The methods I know for solving it are: using the inverse, c = X^{-1} y; using Gaussian elimination; and using the pseudo-inverse. As far as I can tell, these don't match what I thought would be the ground truth. First I generate the true parameters by fitting a polynomial of degree 30 to a cosine with frequency 5, so that y_truth = X c_truth. Then I check whether the above three methods match the truth. I tried it, but the methods don't seem to match, and I don't see why that should be the case. I produced fully runnable, reproducible code:
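
A hedged reconstruction of why the disagreement is expected (my own setup below, since the posted code is cut off): a degree-30 polynomial design matrix is astronomically ill-conditioned, so in float64 the three algebraically equivalent methods amplify rounding error in different directions.

```python
import numpy as np

# Hypothetical reconstruction of the setup: degree-30 polynomial design
# matrix on 31 points, target is a cosine with frequency 5.
x = np.linspace(0, 1, 31)
y = np.cos(2 * np.pi * 5 * x)
X = np.vander(x, 31)                # square Vandermonde matrix

print(np.linalg.cond(X))            # enormous: the system is ill-conditioned

c_inv   = np.linalg.inv(X) @ y      # explicit inverse (numerically worst)
c_solve = np.linalg.solve(X, y)     # LU factorization / Gaussian elimination
c_pinv  = np.linalg.pinv(X) @ y     # SVD-based pseudo-inverse

# Algebraically identical, but with a condition number this large the
# float64 rounding of each algorithm is amplified differently:
print(np.max(np.abs(c_inv - c_solve)))
print(np.max(np.abs(c_solve - c_pinv)))
```

With conditioning that bad, disagreement between the solvers is expected behavior rather than a bug; reformulating with an orthogonal polynomial basis or a lower degree is the usual way out.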