regression

Showing fitted values with R and dplyr

こ雲淡風輕ζ submitted on 2019-12-01 08:33:58

I have the data frame DF. I am using R and dplyr to analyse it. DF contains:

> glimpse(DF)
Observations: 1244160
Variables:
$ Channel (int) 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
$ Row     (int) 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17,...
$ Col     (int) 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1...
$ mean    (dbl) 776.0667, 786.6000, 833.4667, 752.3333, 831.6667, 772.9333...

I fit it with:

Fit <- DF %>% group_by(Channel) %>% do(fit = lm(mean ~ Col + poly(Row, 2), data = .))

How can I get another column in DF with the data (given Channel, Row and

Seaborn barplot with regression line

那年仲夏 submitted on 2019-12-01 07:32:44

Question: Is there a way to add a regression line to a barplot in seaborn where the x-axis contains pandas.Timestamps? For example, overlaying a trendline on the bar plot below. I am looking for the most efficient way to do this:

seaborn.set(style="white", context="talk")
a = pandas.DataFrame.from_dict({'Attendees': {pandas.Timestamp('2016-12-01'): 10, pandas.Timestamp('2017-01-01'): 12, pandas.Timestamp('2017-02-01'): 15, pandas.Timestamp('2017-03-01'): 16, pandas.Timestamp('2017-04-01'): 20}})
ax =

Why do I get NA coefficients and how does `lm` drop reference level for interaction

感情迁移 submitted on 2019-12-01 05:26:40

I am trying to understand how R determines reference groups for interactions in a linear model. Consider the following:

df <- structure(list(
  id = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 4L, 5L, 5L, 5L, 5L, 5L, 5L), .Label = c("1", "2", "3", "4", "5"), class = "factor"),
  year = structure(c(1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L), .Label = c("1", "2"), class = "factor"),
  treatment = structure(c(2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
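
The NA coefficients come from rank deficiency: once the intercept and the other dummy columns are in the design matrix, some interaction columns are linear combinations of columns already present, so lm() cannot identify their coefficients and reports NA. A hypothetical pure-Python illustration of the core fact, that a full set of level dummies plus an intercept is rank-deficient (the rank routine and the toy factor are mine, not from the question):

```python
from fractions import Fraction

def matrix_rank(mat):
    """Rank via Gaussian elimination in exact rational arithmetic."""
    m = [[Fraction(v) for v in row] for row in mat]
    rank, col, ncols = 0, 0, len(mat[0])
    while rank < len(m) and col < ncols:
        pivot = next((i for i in range(rank, len(m)) if m[i][col]), None)
        if pivot is None:
            col += 1
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        for i in range(len(m)):
            if i != rank and m[i][col]:
                f = m[i][col] / m[rank][col]
                m[i] = [a - f * b for a, b in zip(m[i], m[rank])]
        rank += 1
        col += 1
    return rank

# A factor with 3 levels, observed twice each. Intercept plus one
# dummy per level: the dummies sum to the intercept column, so one
# coefficient is unidentifiable -- this is why R drops a reference
# level, and why a column it cannot drop shows up as NA.
levels = [0, 0, 1, 1, 2, 2]
X = [[1] + [1 if lv == k else 0 for k in range(3)] for lv in levels]
```

Here `matrix_rank(X)` is 3 while X has 4 columns, so one column is redundant.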

How to compute standard error from ODR results?

元气小坏坏 submitted on 2019-12-01 04:22:13

I use scipy.odr to make a fit with uncertainties on both x and y, following this question: Correct fitting with scipy curve_fit including errors in x? After the fit I would like to compute the uncertainties on the parameters, so I look at the square root of the diagonal elements of the covariance matrix. I get:

>>> print(np.sqrt(np.diag(output.cov_beta)))
[ 0.17516591  0.33020487  0.27856021]

But the Output also contains output.sd_beta, which is, according to the odr docs, "Standard errors of the estimated parameters, of shape (p,)." It does not give me the same results:

>>>
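
The discrepancy is a scaling issue: in scipy.odr, cov_beta is the covariance matrix *before* scaling by the residual variance, and sd_beta = sqrt(res_var * diag(cov_beta)), with res_var taken from output.res_var. A small stdlib check of that relation (the diagonal values are the ones printed in the question; the res_var value here is made up for illustration, read yours from the fit output):

```python
import math

# diag(cov_beta), squared from the question's printout
cov_diag = [0.17516591**2, 0.33020487**2, 0.27856021**2]
res_var = 2.0   # hypothetical output.res_var

# sd_beta should equal sqrt(res_var) * sqrt(diag(cov_beta))
sd_beta = [math.sqrt(res_var * c) for c in cov_diag]
unscaled = [math.sqrt(c) for c in cov_diag]
ratio = [s / u for s, u in zip(sd_beta, unscaled)]
# every ratio is sqrt(res_var), which is exactly the factor the
# question is missing when comparing sd_beta to sqrt(diag(cov_beta))
```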

How to create a graph showing the predictive model, data and residuals in R

蓝咒 submitted on 2019-12-01 04:04:20

Given two variables, x and y, I run a dynlm regression on them and would like to plot the fitted model against one of the variables, with the residuals shown at the bottom to indicate how the actual data differ from the fitted line. I've seen it done before and I've done it before, but for the life of me I can't remember how, or find anything that explains it. This gets me into the ballpark, where I have a model and two variables, but I can't get the type of graph I want:

library(dynlm)
x <- rnorm(100)
y <- rnorm(100)
model <- dynlm(x ~ y)
plot(x, type="l", col="red")
lines(y,

Plot conditional density curve `P(Y|X)` along a linear regression line

南楼画角 submitted on 2019-12-01 03:53:34

This is my data frame, with two columns, Y (response) and X (covariate):

## Editor edit: use `dat` not `data`
dat <- structure(list(Y = c(NA, -1.793, -0.642, 1.189, -0.823, -1.715, 1.623, 0.964, 0.395, -3.736, -0.47, 2.366, 0.634, -0.701, -1.692, 0.155, 2.502, -2.292, 1.967, -2.326, -1.476, 1.464, 1.45, -0.797, 1.27, 2.515, -0.765, 0.261, 0.423, 1.698, -2.734, 0.743, -2.39, 0.365, 2.981, -1.185, -0.57, 2.638, -1.046, 1.931, 4.583, -1.276, 1.075, 2.893, -1.602, 1.801, 2.405, -5.236, 2.214, 1.295, 1.438, -0.638, 0.716, 1.004, -1.328, -1.759, -1.315, 1.053, 1.958, -2.034, 2.936, -0.078, -0.676, -2
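
Under the usual normal-error model, P(Y | X = x) is a Gaussian with mean equal to the fitted value at x and standard deviation sigma (the residual standard error), so the curve to draw at any chosen x is just a normal density centered on the regression line. A stdlib sketch with hypothetical fit numbers (b0, b1 and sigma below are made up; in practice you would take them from the fitted model):

```python
import math

# Hypothetical fit: fitted(x) = b0 + b1*x, residual SE sigma
b0, b1, sigma = 0.5, 1.2, 0.8

def conditional_density(y, x):
    """Density of Y at y, given X = x, under the Gaussian OLS model."""
    mu = b0 + b1 * x
    z = (y - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

# The density peaks exactly on the regression line, which is what the
# plotted curve along the line should show:
peak = conditional_density(b0 + b1 * 2.0, 2.0)
```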

Regression for a Rate variable in R

|▌冷眼眸甩不掉的悲伤 submitted on 2019-12-01 03:47:07

Question: I was tasked with developing a regression model looking at student enrollment in different programs. This is a very nice, clean data set where the enrollment counts follow a Poisson distribution well. I fit a model in R (using both GLM and zero-inflated Poisson); the resulting residuals seemed reasonable. However, I was then instructed to change the count of students to a "rate", calculated as students / school_population (each school has its own population). This is now no longer a
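
The standard alternative to regressing on the rate directly is to keep the Poisson count model and add log(school_population) as an offset, so that log E[count] = log(pop) + Xβ and the coefficients act on the rate; in R this is the offset = log(school_population) argument to glm. A stdlib sketch of what the offset buys (the coefficient values are made up):

```python
import math

def expected_count(pop, x, b0=-3.0, b1=0.4):
    """E[count] under log E[count] = log(pop) + b0 + b1*x."""
    return pop * math.exp(b0 + b1 * x)

# Doubling the school population doubles the expected enrollment at
# the same covariate value, i.e. the implied *rate* exp(b0 + b1*x)
# is unchanged -- exactly the behaviour a rate model should have,
# without abandoning the Poisson likelihood for the counts.
rate_small = expected_count(500, 1.0) / 500
rate_big = expected_count(1000, 1.0) / 1000
```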

Activation function for output layer for regression models in Neural Networks

寵の児 submitted on 2019-12-01 03:04:17

I have been experimenting with neural networks recently, and I have come across a general question about which activation function to use. This might be a well-known fact, but I couldn't understand it properly. A lot of the examples and papers I have seen work on classification problems, and they use either sigmoid (in the binary case) or softmax (in the multi-class case) as the activation function in the output layer, which makes sense. But I haven't seen any activation function used in the output layer of a regression model. So my question is: is it by choice that we don't use any activation
