linear-regression

Get number of data in each factor level (as well as interaction) from a fitted lm or glm [R]

社会主义新天地 submitted on 2019-12-11 14:09:09
Question: I have a logistic regression model in R, where all of the predictor variables are categorical rather than continuous (in addition to the response variable, which is also obviously categorical/binary). When calling summary(model_name), is there a way to include a column representing the number of observations within each factor level?
Answer 1: "I have a logistic regression model in R, where all of the predictor variables are categorical rather than continuous." If all your covariates are factors

R: How to update model frame after reducing model formula

旧时模样 submitted on 2019-12-11 11:37:12
Question: I am working on a phylogenetic multiple regression using the caper package on Windows 7, and I consistently receive a "Model frame / formula mismatch" error whenever I try to draw a residual leverage plot after generating a reduced model. Here is the minimal code needed to reproduce the error:
    g <- Response ~ (Name1 + Name2 + Name3 + Name4 + Name5 + Name6 + Name7)^2 + Name1Sqd + Name2Sqd + Name3Sqd + Name4Sqd + Name5Sqd + Name6Sqd + Name7Sqd
    crunchMod <- crunch(g, data = contrasts)
    plot

Extracting final p-value from output of regression (lm) in R [duplicate]

*爱你&永不变心* submitted on 2019-12-11 11:37:06
Question: This question already has answers here: pull out p-values and r-squared from a linear regression (12 answers). Closed 4 years ago. I have the following data and code:
    > res = lm(vnum1~vnum2+vch1, data=rndf)
    > sumres=summary(res)
    >
    > sumres

    Call:
    lm(formula = vnum1 ~ vnum2 + vch1, data = rndf)

    Residuals:
         Min       1Q   Median       3Q      Max
    -1.48523 -0.42050  0.05919  0.43710  1.93554

    Coefficients:
                Estimate Std. Error t value Pr(>|t|)
    (Intercept)  -1.0265     1.0192  -1.007   0.3310
    vnum2         1.9538     0.9665   2.022   0.0628 .
    vch1B

LMS batch gradient descent with NumPy

你。 submitted on 2019-12-11 09:45:33
Question: I'm trying to write some very simple LMS batch gradient descent, but I believe I'm doing something wrong with the gradient. The ratio between the order of magnitude and the initial values for theta is very different for the elements of theta, so either theta[2] doesn't move (e.g. if alpha = 1e-8) or theta[1] shoots off (e.g. if alpha = .01).
    import numpy as np
    y = np.array([[400], [330], [369], [232], [540]])
    x = np.array([[2104,3], [1600,3], [2400,3], [1416,2], [3000,4]])
    x = np.concatenate(
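The snippet above is cut off, so what follows is neither the original poster's code nor the accepted answer. It is a minimal sketch of one common remedy for the step-size problem described: standardize the features so that a single alpha suits every element of theta. It assumes that the truncated np.concatenate call prepends an intercept column of ones; the names x_raw, x_scaled, X and batch_gradient_descent, as well as the values of alpha and n_iter, are illustrative choices.

```python
import numpy as np

# Data copied from the question (cast to float for the arithmetic below).
y = np.array([[400], [330], [369], [232], [540]], dtype=float)
x_raw = np.array([[2104, 3], [1600, 3], [2400, 3], [1416, 2], [3000, 4]], dtype=float)

# Standardize each feature column to zero mean and unit variance so the
# gradient components have comparable magnitudes.
mu = x_raw.mean(axis=0)
sigma = x_raw.std(axis=0)
x_scaled = (x_raw - mu) / sigma

# Assumption: the design matrix has a leading column of ones for the intercept.
X = np.hstack([np.ones((x_scaled.shape[0], 1)), x_scaled])


def batch_gradient_descent(X, y, alpha=0.1, n_iter=5000):
    """Minimize the mean squared error with full-batch gradient steps."""
    m = X.shape[0]
    theta = np.zeros((X.shape[1], 1))
    for _ in range(n_iter):
        residual = X @ theta - y          # shape (m, 1)
        gradient = X.T @ residual / m     # gradient of (1 / 2m) * sum of squares
        theta -= alpha * gradient
    return theta


theta_gd = batch_gradient_descent(X, y)

# Reference solution from a direct least-squares solve on the same scaled data.
theta_ls, *_ = np.linalg.lstsq(X, y, rcond=None)
```

On the scaled data, theta_gd can be compared against theta_ls as a sanity check. Note that both are expressed in the standardized units, so new inputs have to be transformed with the same mu and sigma before predicting.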

Model Prediction for pooled regression model in panel data

旧巷老猫 submitted on 2019-12-11 09:07:59
Question: I'm trying to produce a predictive model in which I perform multiple pooled regressions, one in each year (based on previous years), and thus allow the coefficients to vary across time. (This might not make sense for the sample data provided, but it is done in practice for my sample.) Here is what I have come up with so far. I adapted my code to a reproducible sample from the plm package. The data is structured as a panel, indexed by firm and year.
    > head(Grunfeld)
      firm year inv value capital
    1 1

Linear Regression calculation several times in one dataframe

微笑、不失礼 submitted on 2019-12-11 07:56:38
Question: I am using R to evaluate climate data and I have a data set that looks like the following miniaturized version... please forgive my crude posting etiquette, I hope this post is understandable.
    [0]  [STA.NAME]  [YEAR]  [SUM.CDD]
    1    NAME1       1967    760
    2    NAME1       1985    800
    3    NAME1       1996    740
    4    NAME1       2003    810
    5    NAME1       2011    790
    6    NAME2       1967    700
    7    NAME2       1985    690
    8    NAME2       1996    850
    9    NAME2       2003    790
    10   NAME3       1967    760
    11   NAME3       1985    800
    12   NAME3       1990    740
    13   NAME3       1996    810
    14   NAME3       2003    790
    15   NAME3       2011    800
I am trying

Run lm with multiple responses and weights

眉间皱痕 submitted on 2019-12-11 07:39:02
Question: This question was migrated from Cross Validated because it can be answered on Stack Overflow. Migrated 6 years ago. I have to fit a linear model with the same model matrix to multiple responses. This can easily be done in R by specifying the response as a matrix instead of a vector, and computation is very fast this way. Now I would also like to add weights to the model that correspond to the accuracy of the responses. Therefore, for each response vector I would also need a different weight vector.

How to find out the slope value by applying linear regression on trend of a data?

▼魔方 西西 submitted on 2019-12-11 07:35:16
Question: I have time series data from which I am able to extract the trend. Now I need to fit a regression line that best fits the trend data, and I would like to know whether the slope is positive, negative, or constant. Below is my CSV file, which contains the data:
    date,cpu
    2018-02-10 11:52:59.342269+00:00,6.0
    2018-02-10 11:53:04.006971+00:00,6.0
    2018-02-10 22:35:33.438948+00:00,4.0
    2018-02-10 22:35:37.905242+00:00,4.0
    2018-02-11 12:01:00.663084+00:00,4.0
    2018-02-11 12:01:05.136107+00:00,4.0
    2018-02
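One way to read off the sign of the slope is sketched below. It uses only the six complete rows visible in the truncated CSV and fits a straight line to the raw cpu values, which here stand in for the trend component the question mentions. The helper classify_slope and its tolerance tol are hypothetical; a sensible tolerance depends on the units and noise level of the real data.

```python
from datetime import datetime

import numpy as np

# The six complete rows visible in the question's CSV excerpt.
rows = [
    ("2018-02-10 11:52:59.342269+00:00", 6.0),
    ("2018-02-10 11:53:04.006971+00:00", 6.0),
    ("2018-02-10 22:35:33.438948+00:00", 4.0),
    ("2018-02-10 22:35:37.905242+00:00", 4.0),
    ("2018-02-11 12:01:00.663084+00:00", 4.0),
    ("2018-02-11 12:01:05.136107+00:00", 4.0),
]

# Convert timestamps to hours elapsed since the first observation, so the
# slope is expressed in cpu units per hour.
times = [datetime.fromisoformat(ts) for ts, _ in rows]
t0 = times[0]
hours = np.array([(t - t0).total_seconds() / 3600.0 for t in times])
cpu = np.array([value for _, value in rows])

# Degree-1 least-squares fit; polyfit returns the highest power first.
slope, intercept = np.polyfit(hours, cpu, deg=1)


def classify_slope(slope, tol=1e-6):
    """Label a slope as positive, negative, or constant (hypothetical helper)."""
    if slope > tol:
        return "positive"
    if slope < -tol:
        return "negative"
    return "constant"


trend = classify_slope(slope)
```

Measuring time in elapsed hours rather than raw epoch seconds keeps the regressor small, which avoids conditioning warnings from polyfit and makes the slope easier to interpret.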

Linear Regression vs Closed form Ordinary least squares in Python

。_饼干妹妹 submitted on 2019-12-11 07:07:23
Question: I am trying to apply linear regression to a dataset of 9 samples with around 50 features using Python. I have tried several methods, i.e., closed-form OLS (Ordinary Least Squares), LR (Linear Regression), HR (Huber Regression), and NNLS (Non-Negative Least Squares), and each of them gives different weights. I can see intuitively why HR and NNLS have different solutions, but LR and closed-form OLS have the same objective function of minimizing the sum of the squares
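The question is cut off, so the following shows only one possible explanation, not the original answer. With 9 samples and 50 features the system is underdetermined: the 50 x 50 matrix X^T X has rank at most 9, so it cannot be inverted, and many different weight vectors fit the training data exactly. The sketch uses synthetic, seeded random data in place of the poster's dataset, omits the intercept for brevity, and all variable names are illustrative.

```python
import numpy as np

# Synthetic stand-in for the poster's data: 9 samples, 50 features.
rng = np.random.default_rng(0)
n_samples, n_features = 9, 50
X = rng.normal(size=(n_samples, n_features))
y = rng.normal(size=n_samples)

# X (and therefore X^T X) has rank at most 9, so the textbook formula
# (X^T X)^{-1} X^T y is not well defined here.
rank_x = np.linalg.matrix_rank(X)

# Minimum-norm least-squares solution, computed two equivalent ways.
w_min_norm = np.linalg.pinv(X) @ y
w_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)

# A different weight vector with the same fit: add a direction from the
# null space of X (the last right-singular vector of the full SVD).
_, _, vt = np.linalg.svd(X)
null_vec = vt[-1]
w_other = w_min_norm + 5.0 * null_vec

# Both weight vectors reproduce y up to rounding error.
sse_min = np.sum((X @ w_min_norm - y) ** 2)
sse_other = np.sum((X @ w_other - y) ** 2)
```

Under these assumptions, two solvers that minimize the same sum of squares can return different weights simply because they pick different members of the solution set, for example when one uses a pseudoinverse and the other inverts a numerically singular X^T X.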

How to match a data frame of variable names and another with data for a regression?

与世无争的帅哥 submitted on 2019-12-11 07:05:10
Question: I have two data frames:
    x = data.frame(Var1= c("A", "B", "C", "D","E"),Var2=c("F","G","H","I","J"), Value= c(11, 12, 13, 14,18))
    y = data.frame(A= c(11, 12, 13, 14,18), B= c(15, 16, 17, 14,18),C= c(17, 22, 23, 24,18), D= c(11, 12, 13, 34,18),E= c(11, 5, 13, 55,18), F= c(8, 12, 13, 14,18),G= c(7, 5, 13, 14,18), H= c(8, 12, 13, 14,18), I= c(9, 5, 13, 14,18), J= c(11, 12, 13, 14,18))
    Var3 <- rep("time", each=length(x$Var1))
    x=cbind(x,Var3)
    time=seq(1:length(y[,1]))
    y=cbind(y,time)
    > x
      Var1 Var2