linear-regression | 易学教程

Get number of data in each factor level (as well as interaction) from a fitted lm or glm [R]

Question: I have a logistic regression model in R, where all of the predictor variables are categorical rather than continuous (the response variable is, of course, also categorical/binary). When calling summary(model_name), is there a way to include a column representing the number of observations within each factor level?

Answer 1: "I have a logistic regression model in R, where all of the predictor variables are categorical rather than continuous." If all your covariates are factors

R: How to update model frame after reducing model formula

Question: I am working on a phylogenetic multiple regression using the caper package on Windows 7, and I consistently receive a "Model frame / formula mismatch" error whenever I try to draw a residual leverage plot after generating a reduced model. Here is the minimal code needed to reproduce the error:

g <- Response ~ (Name1 + Name2 + Name3 + Name4 + Name5 + Name6 + Name7)^2 + Name1Sqd + Name2Sqd + Name3Sqd + Name4Sqd + Name5Sqd + Name6Sqd + Name7Sqd
crunchMod <- crunch(g, data = contrasts)
plot

Extracting final p-value from output of regression (lm) in R [duplicate]

Question: This question already has answers here: pull out p-values and r-squared from a linear regression (12 answers). Closed 4 years ago. I have the following data and code:

> res = lm(vnum1~vnum2+vch1, data=rndf)
> sumres=summary(res)
>
> sumres

Call:
lm(formula = vnum1 ~ vnum2 + vch1, data = rndf)

Residuals:
     Min       1Q   Median       3Q      Max
-1.48523 -0.42050  0.05919  0.43710  1.93554

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  -1.0265     1.0192  -1.007   0.3310
vnum2         1.9538     0.9665   2.022   0.0628 .
vch1B

LMS batch gradient descent with NumPy

Question: I'm trying to write a very simple LMS batch gradient descent, but I believe I'm doing something wrong with the gradient. The elements of theta differ greatly in order of magnitude relative to their initial values, so either theta[2] doesn't move (e.g. if alpha = 1e-8) or theta[1] shoots off (e.g. if alpha = .01).

import numpy as np
y = np.array([[400], [330], [369], [232], [540]])
x = np.array([[2104,3], [1600,3], [2400,3], [1416,2], [3000,4]])
x = np.concatenate(
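The excerpt stops before the asker's gradient code, so the following is only a minimal sketch of batch gradient descent on the same five data points, not a reconstruction of the original script. It assumes the features are standardized first (a swapped-in preprocessing step, not something the question shows), which lets a single learning rate work for every element of theta. The function name `batch_gradient_descent` and the values of `alpha` and `n_iters` are illustrative choices.

```python
import numpy as np

# Data as given in the question.
y = np.array([[400], [330], [369], [232], [540]], dtype=float)
x = np.array([[2104, 3], [1600, 3], [2400, 3], [1416, 2], [3000, 4]], dtype=float)

# Assumption: standardize each feature to zero mean and unit variance so
# that all parameters live on a comparable scale.
x_scaled = (x - x.mean(axis=0)) / x.std(axis=0)

# Prepend a column of ones for the intercept term.
X = np.hstack([np.ones((x.shape[0], 1)), x_scaled])


def batch_gradient_descent(X, y, alpha=0.1, n_iters=5000):
    """Minimize the mean squared error with full-batch gradient steps."""
    m, n = X.shape
    theta = np.zeros((n, 1))
    for _ in range(n_iters):
        # Gradient of (1 / (2m)) * ||X @ theta - y||^2 with respect to theta.
        gradient = X.T @ (X @ theta - y) / m
        theta -= alpha * gradient
    return theta


theta_gd = batch_gradient_descent(X, y)

# Reference solution on the same scaled design matrix.
theta_ls, *_ = np.linalg.lstsq(X, y, rcond=None)

print(theta_gd.ravel())
print(theta_ls.ravel())
```

Note that the fitted coefficients refer to the standardized features; to predict for a new raw input, apply the same mean and standard deviation before multiplying by theta.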

Model Prediction for pooled regression model in panel data

Question: I'm trying to produce a predictive model in which I run multiple pooled regressions, one for each year (based on previous years), and thus allow the coefficients to vary across time. (This might not make sense for the sample data provided, but it is done in practice for my sample.) Here is what I have come up with so far. I adapted my code to a reproducible sample from the plm package. The data is structured as a panel, indexed by firm and year:

> head(Grunfeld)
  firm year inv value capital
1 1

Linear Regression calculation several times in one dataframe

Question: I am using R to evaluate climate data and I have a data set that looks like the following miniaturized version... please forgive my crude posting etiquette, I hope this post is understandable.

[0]  [STA.NAME]  [YEAR]  [SUM.CDD]
1    NAME1       1967    760
2    NAME1       1985    800
3    NAME1       1996    740
4    NAME1       2003    810
5    NAME1       2011    790
6    NAME2       1967    700
7    NAME2       1985    690
8    NAME2       1996    850
9    NAME2       2003    790
10   NAME3       1967    760
11   NAME3       1985    800
12   NAME3       1990    740
13   NAME3       1996    810
14   NAME3       2003    790
15   NAME3       2011    800

I am trying

Run lm with multiple responses and weights

Question: This question was migrated from Cross Validated because it can be answered on Stack Overflow. Migrated 6 years ago. I have to fit a linear model with the same model matrix to multiple responses. This can easily be done in R by specifying the response as a matrix instead of a vector, and computation is very fast this way. Now I would also like to add weights to the model that correspond to the accuracy of the responses. Therefore, for each response vector I would also need a different weight vector.

How to find out the slope value by applying linear regression on trend of a data?

Question: I have time series data from which I am able to extract the trend. Now I need to fit a regression line that best fits the trend data, and I would like to know whether the slope is positive, negative, or constant. Below is my CSV file, which contains the data:

date,cpu
2018-02-10 11:52:59.342269+00:00,6.0
2018-02-10 11:53:04.006971+00:00,6.0
2018-02-10 22:35:33.438948+00:00,4.0
2018-02-10 22:35:37.905242+00:00,4.0
2018-02-11 12:01:00.663084+00:00,4.0
2018-02-11 12:01:05.136107+00:00,4.0
2018-02
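As a rough illustration of reading off the sign of a fitted slope, the sketch below fits a degree-1 polynomial with `np.polyfit` to the six complete rows visible in the excerpt. It makes several assumptions: the raw `cpu` values stand in for the trend component (the asker's trend extraction is not shown), time is measured in hours since the first sample, and `tolerance` is a hypothetical threshold for treating the slope as constant.

```python
from datetime import datetime

import numpy as np

# The six complete rows visible in the excerpt; the rest of the file is cut off.
rows = [
    ("2018-02-10 11:52:59.342269+00:00", 6.0),
    ("2018-02-10 11:53:04.006971+00:00", 6.0),
    ("2018-02-10 22:35:33.438948+00:00", 4.0),
    ("2018-02-10 22:35:37.905242+00:00", 4.0),
    ("2018-02-11 12:01:00.663084+00:00", 4.0),
    ("2018-02-11 12:01:05.136107+00:00", 4.0),
]

timestamps = [datetime.fromisoformat(ts) for ts, _ in rows]
cpu = np.array([value for _, value in rows])

# Convert timestamps to elapsed hours so the regressor is numeric.
hours = np.array(
    [(t - timestamps[0]).total_seconds() / 3600.0 for t in timestamps]
)

# Least-squares straight line: cpu ~ slope * hours + intercept.
slope, intercept = np.polyfit(hours, cpu, deg=1)

# Hypothetical tolerance below which the slope is treated as constant.
tolerance = 1e-9
if slope > tolerance:
    trend = "increasing"
elif slope < -tolerance:
    trend = "decreasing"
else:
    trend = "constant"

print(f"slope per hour: {slope:.4f} ({trend})")
```

In practice the tolerance would need to reflect the noise level of the data, or be replaced by a significance test on the slope.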

Linear Regression vs Closed form Ordinary least squares in Python

Question: I am trying to apply linear regression to a dataset of 9 samples with around 50 features using Python. I have tried several methods, i.e. closed-form OLS (Ordinary Least Squares), LR (Linear Regression), HR (Huber Regression), and NNLS (Non-Negative Least Squares), and each of them gives different weights. I can see intuitively why HR and NNLS have different solutions, but LR and closed-form OLS have the same objective function of minimizing the sum of the squares
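One property of this setup is worth illustrating: with 9 samples and about 50 features the system is underdetermined, so the least-squares minimizer is not unique, and different solvers may return different weight vectors that all fit the data exactly. The sketch below demonstrates this on synthetic data (the random `X` and `y` are assumptions, not the asker's dataset) and is not a diagnosis of the asker's specific code, which the excerpt does not show.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for the question's shape: 9 samples, 50 features.
X = rng.normal(size=(9, 50))
y = rng.normal(size=9)

# Minimum-norm least-squares solution via the pseudoinverse.
X_pinv = np.linalg.pinv(X)
w_min_norm = X_pinv @ y

# Build a second solution by adding a vector from the null space of X:
# project a random vector onto the null space, so that X @ null_part is ~0.
v = rng.normal(size=50)
null_part = v - X_pinv @ (X @ v)
w_other = w_min_norm + null_part

# Both weight vectors reproduce y up to floating-point error.
residual_min_norm = np.linalg.norm(X @ w_min_norm - y)
residual_other = np.linalg.norm(X @ w_other - y)

print(residual_min_norm, residual_other)
print(np.linalg.norm(w_min_norm), np.linalg.norm(w_other))
```

Both residuals are zero up to rounding, yet the two weight vectors differ; the pseudoinverse solution is simply the one with the smallest Euclidean norm.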

How to match a data frame of variable names and another with data for a regression?

Question: I have two data frames:

x = data.frame(Var1= c("A", "B", "C", "D","E"),Var2=c("F","G","H","I","J"), Value= c(11, 12, 13, 14,18))
y = data.frame(A= c(11, 12, 13, 14,18), B= c(15, 16, 17, 14,18),C= c(17, 22, 23, 24,18), D= c(11, 12, 13, 34,18),E= c(11, 5, 13, 55,18), F= c(8, 12, 13, 14,18),G= c(7, 5, 13, 14,18), H= c(8, 12, 13, 14,18), I= c(9, 5, 13, 14,18), J= c(11, 12, 13, 14,18))
Var3 <- rep("time", each=length(x$Var1))
x=cbind(x,Var3)
time=seq(1:length(y[,1]))
y=cbind(y,time)

> x
  Var1 Var2