linear-regression

Linear regression with matplotlib / numpy

别等时光非礼了梦想 · Submitted on 2019-11-27 06:07:28
I'm trying to generate a linear regression on a scatter plot I have generated. However, my data is in list format, and all of the examples I can find of using polyfit require using arange, which doesn't accept lists. I have searched high and low about how to convert a list to an array and nothing seems clear. Am I missing something? Following on, how best can I use my list of integers as input to polyfit? Here is the polyfit example I am following:

    from pylab import *
    x = arange(data)
    y = arange(data)
    m, b = polyfit(x, y, 1)
    plot(x, y, 'yo', x, m*x+b, '--k')
    show()
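In fact, numpy's polyfit accepts plain Python lists directly (they are converted internally with np.asarray), so no arange call is needed at all. A minimal sketch, using made-up x/y lists since the question's data is not shown:

```python
import numpy as np

# polyfit takes plain lists; only the plotting step benefits from an array,
# so that m * xs + b is vectorised.
x = [1, 2, 3, 4, 5]               # list of integers, no arange needed
y = [2.1, 3.9, 6.2, 7.8, 10.1]    # hypothetical measurements

m, b = np.polyfit(x, y, 1)        # slope and intercept of the least-squares line

xs = np.asarray(x)                # explicit conversion, useful for plotting
fit = m * xs + b                  # fitted line values at each x
```

With matplotlib the last two lines would feed straight into `plot(xs, y, 'yo', xs, fit, '--k')`.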

lme4::lmer reports “fixed-effect model matrix is rank deficient”, do I need a fix and how to?

∥☆過路亽.° · Submitted on 2019-11-27 04:06:18
I am trying to run a mixed-effects model that predicts F2_difference with the rest of the columns as predictors, but I get an error message that says:

    fixed-effect model matrix is rank deficient so dropping 7 columns / coefficients

From this link, "Fixed-effects model is rank deficient", I think I should use findLinearCombos in the R package caret. However, when I try findLinearCombos(data.df), it gives me the error message:

    Error in qr.default(object) : NA/NaN/Inf in foreign function call (arg 1)
    In addition: Warning message:
    In qr.default(object) : NAs introduced by coercion

My data does not …
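The findLinearCombos error usually means the data frame still contains non-numeric or missing values (coerced to NA), so it must be restricted to complete numeric columns first. The rank-deficiency diagnosis itself can be sketched in numpy (Python rather than R, since this digest mixes both); the toy design matrix and threshold below are assumptions for illustration:

```python
import numpy as np

# Toy design matrix with one redundant column: the last one is x1 + x2.
rng = np.random.default_rng(0)
x1 = rng.normal(size=20)
x2 = rng.normal(size=20)
X = np.column_stack([np.ones(20), x1, x2, x1 + x2])

# Rank below the column count is exactly "rank deficient".
deficient = np.linalg.matrix_rank(X) < X.shape[1]

# Near-zero diagonal entries of R in a QR factorisation flag the columns a
# fitter would drop; here the redundancy lands in the last column.
R = np.linalg.qr(X, mode='r')
dependent = np.abs(np.diag(R)) < 1e-8
```

caret's findLinearCombos does the same kind of QR-based bookkeeping, which is why it needs a fully numeric, NA-free matrix.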

Why does lm run out of memory while matrix multiplication works fine for coefficients?

让人想犯罪 __ · Submitted on 2019-11-27 03:20:28
Question: I am trying to do fixed-effects linear regression with R. My data looks like:

    dte  yr  id  v1  v2
    .    .   .   .   .
    .    .   .   .   .

I then decided to simply do this by making yr a factor and using lm:

    lm(v1 ~ factor(yr) + v2 - 1, data = df)

However, this seems to run out of memory. I have 20 levels in my factor and df is 14 million rows, which takes about 2 GB to store; I am running this on a machine with 22 GB dedicated to this process. I then decided to try things the old-fashioned way: create dummy …
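One standard way to avoid materialising the dummy matrix entirely is the within transformation: demean the response and regressors within each factor level, then regress without any dummies. By the Frisch–Waugh–Lovell theorem this recovers the same v2 coefficient as the dummy-variable fit. A small numpy sketch on synthetic data (variable names and sizes are made up to mirror the question):

```python
import numpy as np

# Synthetic panel: 20 year levels, a v2 regressor with true slope 0.5.
rng = np.random.default_rng(1)
n = 1000
yr = rng.integers(0, 20, size=n)
v2 = rng.normal(size=n)
year_effect = np.linspace(0.0, 2.0, 20)[yr]
v1 = year_effect + 0.5 * v2 + rng.normal(scale=0.1, size=n)

def demean_by_group(x, g):
    # Subtract each group's mean from its members; O(n) memory, no dummies.
    means = np.bincount(g, weights=x) / np.bincount(g)
    return x - means[g]

y_t = demean_by_group(v1, yr)
x_t = demean_by_group(v2, yr)
beta_v2 = (x_t @ y_t) / (x_t @ x_t)   # within estimator of the v2 slope
```

The 14-million-row case never needs the 14M x 20 dummy block, only two demeaned vectors.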

Linear Regression and storing results in data frame [duplicate]

不打扰是莪最后的温柔 · Submitted on 2019-11-27 02:29:44
Question: This question already has an answer here: "Linear Regression and group by in R" (10 answers). I am running a linear regression on some variables in a data frame. I'd like to be able to subset the linear regressions by a categorical variable, run the linear regression for each level, and then store the t-stats in a data frame. I'd like to do this without a loop if possible. Here's a sample of what I'm trying to do:

    a <- c("a","a","a","a","a", "b","b","b","b","b", "c","c","c","c","c")
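The split-fit-collect pattern can be sketched in numpy (a Python analogue, since the question's R code is only partially shown): group by the categorical variable, fit y ~ x per group, and keep the slope t-statistic. The data below is synthetic:

```python
import numpy as np

# Three groups of 30 observations; true slope 2, noise sd 0.5.
rng = np.random.default_rng(2)
groups = np.repeat(["a", "b", "c"], 30)
x = rng.normal(size=90)
y = 2.0 * x + rng.normal(scale=0.5, size=90)

def slope_t(xg, yg):
    # t-statistic of the slope in a simple regression yg ~ xg.
    n = len(xg)
    xc, yc = xg - xg.mean(), yg - yg.mean()
    slope = (xc @ yc) / (xc @ xc)
    resid = yc - slope * xc
    se = np.sqrt((resid @ resid) / (n - 2) / (xc @ xc))
    return slope / se

# One comprehension instead of an explicit loop, as the question asks.
t_stats = {g: slope_t(x[groups == g], y[groups == g]) for g in np.unique(groups)}
```

In R the idiomatic equivalents are `by()`, `lapply(split(...))`, or dplyr's `group_by() %>% do()`.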

How to return predicted values, residuals, and R-squared from lm.fit in R?

不想你离开。 · Submitted on 2019-11-27 02:23:34
Question: This piece of code returns the coefficients (intercept, slope1, slope2):

    set.seed(1)
    n = 10
    y = rnorm(n)
    x1 = rnorm(n)
    x2 = rnorm(n)
    lm.ft = function(y, x1, x2) return(lm(y ~ x1 + x2)$coef)
    res = list()
    for (i in 1:n) {
      x1.bar = x1 - x1[i]
      x2.bar = x2 - x2[i]
      res[[i]] = lm.ft(y, x1.bar, x2.bar)
    }

If I type res[[1]], I get:

    (Intercept)          x1          x2
    -0.44803887  0.06398476 -0.62798646

How can I return predicted values, residuals, R-squared, etc.? I need something general to extract whatever I need from the summary.

Answer 1: There are a …
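Everything the question asks for can be recomputed directly from the design matrix and the coefficients, which is useful when a fast fitter (like lm.fit) returns only coefficients. A numpy sketch on synthetic data mirroring the y ~ x1 + x2 setup:

```python
import numpy as np

# Synthetic data in the shape of the question: y regressed on x1 and x2.
rng = np.random.default_rng(3)
n = 10
x1, x2 = rng.normal(size=n), rng.normal(size=n)
y = 1.0 + 0.5 * x1 - 0.3 * x2 + rng.normal(scale=0.2, size=n)

X = np.column_stack([np.ones(n), x1, x2])          # intercept column first
coef, *_ = np.linalg.lstsq(X, y, rcond=None)       # like lm.fit's $coefficients

fitted = X @ coef                                  # predicted values
resid = y - fitted                                 # residuals
r_squared = 1 - (resid @ resid) / np.sum((y - y.mean()) ** 2)
```

In R the same identities apply: `fitted <- X %*% coef`, `resid <- y - fitted`, and R-squared from the residual and total sums of squares; `summary(lm(...))` just packages them.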

Accuracy Score ValueError: Can't Handle mix of binary and continuous target

浪子不回头ぞ · Submitted on 2019-11-27 00:56:04
I'm using linear_model.LinearRegression from scikit-learn as a predictive model. It works, and it's perfect. I have a problem evaluating the predicted results using the accuracy_score metric. This is my true data:

    array([1, 1, 0, 0, 0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 0, 0, 0])

My predicted data:

    array([ 0.07094605,  0.1994941 ,  0.19270157,  0.13379635,  0.04654469,
            0.09212494,  0.19952108,  0.12884365,  0.15685076, -0.01274453,
            0.32167554,  0.32167554, -0.10023553,  0.09819648, -0.06755516,
            0.25390082,  0.17248324])

My code:

    accuracy_score(y_true, y_pred, normalize=False)

Error message:

    ValueError: Can't …
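The error arises because accuracy_score expects class labels on both sides, while LinearRegression returns continuous scores. Thresholding the scores into labels first resolves it; a sketch with the question's data (scores rounded to two decimals here, and the 0.5 cutoff is an assumption to be tuned per problem):

```python
import numpy as np

y_true = np.array([1, 1, 0, 0, 0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 0, 0, 0])
y_score = np.array([0.07, 0.20, 0.19, 0.13, 0.05, 0.09, 0.20, 0.13,
                    0.16, -0.01, 0.32, 0.32, -0.10, 0.10, -0.07, 0.25, 0.17])

y_pred = (y_score >= 0.5).astype(int)     # now binary, like y_true
accuracy = (y_pred == y_true).mean()      # what accuracy_score(normalize=True) computes
```

With labels on both sides, `sklearn.metrics.accuracy_score(y_true, y_pred)` no longer raises; alternatively, use a classifier (e.g. LogisticRegression) so predictions are labels from the start.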

Loop linear regression and saving coefficients

扶醉桌前 · Submitted on 2019-11-26 23:25:54
Question: This is part of the dataset (named "ME1") I'm using (all variables are numeric):

       Year AgeR       rateM
    1  1751 -1.0 0.241104596
    2  1751 -0.9 0.036093609
    3  1751 -0.8 0.011623734
    4  1751 -0.7 0.006670552
    5  1751 -0.6 0.006610552
    6  1751 -0.5 0.008510828
    7  1751 -0.4 0.009344041
    8  1751 -0.3 0.011729740
    9  1751 -0.2 0.010988005
    10 1751 -0.1 0.015896107
    11 1751  0.0 0.018190140
    12 1751  0.1 0.024588340
    13 1751  0.2 0.029801362
    14 1751  0.3 0.044515912
    15 1751  0.4 0.055240354
    16 1751  0.5 0.088476758
    17 1751  0.6 0 …
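The loop-and-save pattern the title describes — fit rateM ~ AgeR separately per Year and keep the coefficients — can be sketched in numpy (synthetic data standing in for ME1; the true intercept/slope below are assumptions):

```python
import numpy as np

# Synthetic ME1-like data: three years, AgeR from -1.0 to 1.0 in steps of 0.1.
rng = np.random.default_rng(4)
years = np.repeat([1751, 1752, 1753], 21)
ageR = np.tile(np.arange(-1.0, 1.05, 0.1), 3)
rateM = 0.05 + 0.05 * ageR + rng.normal(scale=0.005, size=len(ageR))

# Fit per year and store (intercept, slope) keyed by year.
coefs = {}
for yr in np.unique(years):
    m = years == yr
    slope, intercept = np.polyfit(ageR[m], rateM[m], 1)
    coefs[yr] = (intercept, slope)
```

In R the equivalent is a loop (or lapply over split(ME1, ME1$Year)) calling `coef(lm(rateM ~ AgeR, data = subset))` and rbind-ing the results.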

lm(): What is qraux returned by QR decomposition in LINPACK / LAPACK

两盒软妹~` · Submitted on 2019-11-26 22:09:31
Question: rich.main3 is a linear model in R. I understand the rest of the elements of the list, but I don't get what qraux is. The documentation states that it is "a vector of length ncol(x) which contains additional information on \bold{Q}". What additional information does it mean?

    str(rich.main3$qr)
    qr : num [1:164, 1:147] -12.8062 0.0781 0.0781 0.0781 0.0781 ...
     ..- attr(*, "dimnames")=List of 2
     .. ..$ : chr [1:164] "1" "2" "3" "4" ...
     .. ..$ : chr [1:147] "(Intercept)" "S2" "S3" "x1" ...
     ..- attr(*, …
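Concretely, LINPACK/LAPACK store QR compactly: R sits in the upper triangle of $qr, the Householder reflector vectors sit below the diagonal, and qraux holds one scalar per reflection (LAPACK calls it tau, with H = I - tau * v v'). Those scalars are the "additional information" needed to rebuild Q. A self-contained numpy sketch of that storage scheme (written out explicitly rather than calling a library, so the role of tau is visible):

```python
import numpy as np

def householder_qr(A):
    """Compact QR via Householder reflections H = I - tau * v v'.
    Returns R, the reflector vectors v, and the scalars tau; tau plays
    exactly the role qraux plays in LINPACK/LAPACK output."""
    A = A.astype(float).copy()
    m, n = A.shape
    vs, taus = [], []
    for k in range(n):
        x = A[k:, k].copy()
        alpha = -np.copysign(np.linalg.norm(x), x[0] if x[0] != 0 else 1.0)
        v = x
        v[0] -= alpha                  # v = x - alpha * e1 (no cancellation)
        tau = 2.0 / (v @ v)
        A[k:, k:] -= tau * np.outer(v, v @ A[k:, k:])   # apply H in place
        vs.append(v)
        taus.append(tau)
    return np.triu(A[:n, :]), vs, np.array(taus)

rng = np.random.default_rng(5)
M = rng.normal(size=(6, 4))
R, vs, taus = householder_qr(M)

# Q = H1 H2 ... Hn, rebuilt from the v vectors and the tau scalars alone.
Q = np.eye(6)
for k in range(len(vs) - 1, -1, -1):
    v = np.zeros(6)
    v[k:] = vs[k]
    Q -= taus[k] * np.outer(v, v @ Q)
Q = Q[:, :4]                           # thin Q, matching R's 4 columns
```

R never forms Q explicitly for lm(); functions like qr.qy() and qr.qty() apply the reflections straight from $qr and $qraux, which is both faster and lighter on memory.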

predict.lm() with an unknown factor level in test data

99封情书 · Submitted on 2019-11-26 22:03:30
I am fitting a model to factor data and predicting. If the newdata in predict.lm() contains a single factor level that is unknown to the model, all of predict.lm() fails and returns an error. Is there a good way to have predict.lm() return a prediction for those factor levels the model knows and NA for unknown factor levels, instead of only an error? Example code:

    foo <- data.frame(response=rnorm(3), predictor=as.factor(c("A","B","C")))
    model <- lm(response ~ predictor, foo)
    foo.new <- data.frame(predictor=as.factor(c("A","B","C","D")))
    predict(model, newdata=foo.new)

I would like the very last …
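For this particular model the desired behaviour is easy to see: with a single factor predictor, the fitted values are just per-level means, so a "safe" predict can look each new level up and fall back to NaN when it is unseen. A Python sketch of that idea (the numeric responses stand in for the question's rnorm(3); a real multi-predictor model would need the full dummy encoding instead of a plain lookup):

```python
import numpy as np

# Training data: one response per factor level, as in the question.
train_level = np.array(["A", "B", "C"])
train_resp = np.array([0.1, -0.4, 0.7])   # stand-ins for rnorm(3)

# "Fit": per-level means (what lm fits for a single-factor model).
level_mean = {lvl: train_resp[train_level == lvl].mean()
              for lvl in np.unique(train_level)}

# "Safe predict": known levels get their fitted value, unknown levels get NaN.
new_levels = ["A", "B", "C", "D"]          # "D" was never seen in training
pred = np.array([level_mean.get(lvl, np.nan) for lvl in new_levels])
```

In R, the usual route is to subset newdata to rows whose levels appear in `model$xlevels`, predict on those, and merge NA back in for the rest.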

R: lm() result differs when using `weights` argument and when using manually reweighted data

家住魔仙堡 · Submitted on 2019-11-26 20:22:33
Question: In order to correct heteroskedasticity in the error terms, I am running the following weighted least squares regression in R:

    #Call:
    #lm(formula = a ~ q + q2 + b + c, data = mydata, weights = weighting)
    #
    #Weighted Residuals:
    #     Min       1Q   Median       3Q      Max
    #-1.83779 -0.33226  0.02011  0.25135  1.48516
    #
    #Coefficients:
    #             Estimate Std. Error t value Pr(>|t|)
    #(Intercept) -3.939440   0.609991  -6.458 1.62e-09 ***
    #q            0.175019   0.070101   2.497 0.013696 *
    #q2           0.048790   0.005613   8.693 8.49e-15 ***
    #b            0.473891   0.134918   3 …
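lm(..., weights = w) minimises the weighted residual sum of squares, and its coefficients coincide with ordinary least squares on data whose rows are scaled by sqrt(w), not by w itself; mismatched results usually come from scaling by w, or from comparing the (legitimately different) reported summaries. A numpy sketch of the equivalence, on synthetic heteroskedastic data:

```python
import numpy as np

# Heteroskedastic data: noise sd grows from 0.5 to 2.0 across observations.
rng = np.random.default_rng(6)
n = 50
x = rng.normal(size=n)
sd = np.linspace(0.5, 2.0, n)
y = 1.0 + 2.0 * x + rng.normal(scale=sd)
w = 1.0 / sd ** 2                      # inverse-variance weights

X = np.column_stack([np.ones(n), x])

# Route 1: weighted normal equations (X'WX) beta = X'Wy, like lm(weights=w).
beta_wls = np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (w * y))

# Route 2: plain OLS after scaling every row of X and y by sqrt(w).
sw = np.sqrt(w)
beta_scaled, *_ = np.linalg.lstsq(sw[:, None] * X, sw * y, rcond=None)
```

Both routes give the same coefficient vector; the same check in R is `coef(lm(y ~ x, weights = w))` against `coef(lm(I(sw*y) ~ 0 + I(sw) + I(sw*x)))`.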