linear-regression

How to create many linear models at once and put the coefficients into a new matrix?

╄→гoц情女王★ Posted on 2020-01-11 07:51:44
Question: I have 365 columns, each holding 60 values. I need to know the rate of change over time for each column (the slope, or linear coefficient). I created a generic column as the sequence 1:60 to represent the 60 corresponding time intervals. I want to fit 365 linear regression models, pairing the generic time-stamp column with each of the 365 data columns. In other words, I have many columns and I would like to create many linear regression models at once, extract the slope coefficients, and store them in a new matrix.
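A minimal Python sketch of one way to do this (the question names no language, and the data below are randomly generated stand-ins): ordinary least squares accepts a matrix of right-hand sides, so a single solve fits all 365 regressions at once.

    import numpy as np

    # Stand-in data: 60 time points (rows) x 365 series (columns).
    rng = np.random.default_rng(0)
    data = rng.normal(size=(60, 365))

    t = np.arange(1, 61)                        # generic time stamps 1..60
    A = np.column_stack([t, np.ones_like(t)])   # design matrix [t, 1]

    # One least-squares solve fits every column simultaneously.
    coefs, *_ = np.linalg.lstsq(A, data, rcond=None)
    slopes, intercepts = coefs[0], coefs[1]     # each of shape (365,)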

Efficient Cointegration Test in Python

冷暖自知 Posted on 2020-01-10 07:58:20
Question: I am wondering if there is a better way to test whether two variables are cointegrated than the following method:

    import numpy as np
    import statsmodels.api as sm
    import statsmodels.tsa.stattools as ts

    y = np.random.normal(0, 1, 250)
    x = np.random.normal(0, 1, 250)

    def cointegration_test(y, x):
        # Step 1: regress one variable on the other
        ols_result = sm.OLS(y, x).fit()
        # Step 2: obtain the residual (ols_result.resid)
        # Step 3: apply the Augmented Dickey-Fuller test to see whether
        # the residual has a unit root
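For what it's worth, statsmodels ships this Engle-Granger procedure as a single call, statsmodels.tsa.stattools.coint, which regresses one series on the other and runs an augmented Dickey-Fuller test on the residuals internally. A short sketch using the question's random data:

    import numpy as np
    from statsmodels.tsa.stattools import coint

    y = np.random.normal(0, 1, 250)
    x = np.random.normal(0, 1, 250)

    # Returns the ADF t-statistic on the residuals, its p-value,
    # and the 1%/5%/10% critical values.
    t_stat, p_value, crit_values = coint(y, x)
    print(p_value)   # small p-value -> evidence of cointegration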

R: multiple linear regression model and prediction model

会有一股神秘感。 Posted on 2020-01-09 12:04:33
Question: Starting from a linear model, model1 = lm(temp ~ alt + sdist), I need to develop a prediction model where new data will come to hand and predictions about temp will be made. I have tried something like this: model2 = predict.lm(model1, newdata = newdataset). However, I am not sure this is the right way. What I would like to know is whether this is the right way to make predictions about temp. I am also a bit confused about newdataset: which values should be filled in for the predictors?
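The question is in R, but the pattern is the same everywhere: newdataset must contain one column per predictor (here alt and sdist), and the fitted model fills in temp. A Python sketch of the equivalent workflow with statsmodels' formula API (the training values below are made up purely for illustration):

    import pandas as pd
    import statsmodels.formula.api as smf

    # Made-up training data using the question's variable names.
    train = pd.DataFrame({
        "temp":  [12.1, 9.4, 15.2, 11.0, 8.7, 13.6],
        "alt":   [200, 850, 120, 430, 990, 310],
        "sdist": [3.2, 7.8, 1.1, 4.5, 9.0, 2.6],
    })
    model1 = smf.ols("temp ~ alt + sdist", data=train).fit()

    # New data carries only the predictors; predict() produces temp.
    newdataset = pd.DataFrame({"alt": [300, 700], "sdist": [2.0, 6.5]})
    predictions = model1.predict(newdataset)
    print(predictions)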

Different model performance evaluations by statsmodels and scikit-learn

做~自己de王妃 Posted on 2020-01-07 09:22:09
Question: I am trying to fit a multivariable linear regression to a dataset to find out how well the model explains the data. My predictors have 120 dimensions and I have 177 samples: X.shape = (177, 120), y.shape = (177,). Using statsmodels, I get a very good R-squared of 0.76 with a Prob(F-statistic) of 0.06, which trends towards significance and indicates a good model for the data. When I use scikit-learn's linear regression and try to compute the 5-fold cross-validation r2 score, I get an average r2 score of…
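One likely explanation, sketched below: with 120 predictors and only 177 samples, OLS can achieve a high in-sample R² even on pure noise, while the cross-validated R² on held-out folds collapses. The data here are random by construction, so neither library is wrong; they measure different things.

    import numpy as np
    from sklearn.linear_model import LinearRegression
    from sklearn.model_selection import cross_val_score

    rng = np.random.default_rng(0)
    X = rng.normal(size=(177, 120))   # 120 pure-noise predictors
    y = rng.normal(size=177)          # target unrelated to X

    # In-sample R^2 is high: 120 free parameters partly memorize 177 points.
    in_sample_r2 = LinearRegression().fit(X, y).score(X, y)

    # Out-of-sample (5-fold cross-validated) R^2 drops to near or below zero.
    cv_r2 = cross_val_score(LinearRegression(), X, y, cv=5, scoring="r2").mean()
    print(in_sample_r2, cv_r2)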

Different Linear Regression Coefficients with statsmodels and sklearn

戏子无情 Posted on 2020-01-07 05:58:05
Question: I was planning to use sklearn.linear_model to plot a graph of the linear regression result, and statsmodels.api to get a detailed summary of the fit. However, the two packages produce very different results on the same input. For example, the constant term from sklearn is 7.8e-14, but the constant term from statsmodels is 48.6. (I added a column of 1's to x for the constant term when using both methods.) My code for both methods is succinct:

    # Use statsmodels linear regression to get a…
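A plausible culprit, offered as a guess rather than a diagnosis of the asker's exact code: sklearn's LinearRegression fits its own intercept by default, so a manually added column of ones becomes redundant and its reported coefficient can land near zero (e.g. 7.8e-14). Passing fit_intercept=False makes the two libraries agree. Illustrative data below:

    import numpy as np
    import statsmodels.api as sm
    from sklearn.linear_model import LinearRegression

    rng = np.random.default_rng(0)
    x = rng.normal(size=(100, 1))
    y = 48.6 + 3.0 * x[:, 0] + rng.normal(size=100)

    X = sm.add_constant(x)                  # explicit column of ones

    sm_params = sm.OLS(y, X).fit().params   # [intercept, slope]

    # fit_intercept=False because X already carries the constant column;
    # the default (True) would add a second, competing intercept.
    sk = LinearRegression(fit_intercept=False).fit(X, y)
    print(sm_params, sk.coef_)              # now essentially identical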

How to extrapolate simple linear regression and get errors for the coefficients in Python?

走远了吗. Posted on 2020-01-06 07:05:41
Question: Here is my sample data:

    import numpy as np
    import matplotlib.pylab as pl

    x = np.array([19.0, 47.0, 34.6, 23.2, 33.5, 28.2, 34.8, 15.8, 23.8])
    y = np.array([6.12, 3.55, 2.67, 2.81, 5.34, 3.75, 3.43, 1.44, 0.84])

    pl.scatter(x, y, facecolors='b', edgecolors='b', s=24)
    x = x[:, np.newaxis]
    a, _, _, _ = np.linalg.lstsq(x, y)
    pl.plot(x, a * x, 'r-')
    pl.xlim(0, 50)
    pl.ylim(0, 7)

You can see in the plot that the linear fit does not reach y=0. How can I find the x-value at which y=0 (i.e., extrapolate the data)? And is there a way to get an error estimate for the fit coefficients?
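One way to get both pieces (not necessarily the asker's intended route): np.polyfit with cov=True returns the covariance matrix of the fitted coefficients, and the fitted line can then be solved for y = 0. Note the question's lstsq call fits a line through the origin by construction; the sketch below fits an intercept so the y = 0 crossing is meaningful.

    import numpy as np

    x = np.array([19.0, 47.0, 34.6, 23.2, 33.5, 28.2, 34.8, 15.8, 23.8])
    y = np.array([6.12, 3.55, 2.67, 2.81, 5.34, 3.75, 3.43, 1.44, 0.84])

    # Degree-1 fit with intercept; cov=True also returns the coefficient
    # covariance matrix, whose diagonal holds the variances.
    (slope, intercept), cov = np.polyfit(x, y, 1, cov=True)
    slope_err, intercept_err = np.sqrt(np.diag(cov))

    # The fitted line intercept + slope*x crosses y = 0 at x = -intercept/slope.
    x_at_y0 = -intercept / slope
    print(slope, slope_err, intercept, intercept_err, x_at_y0)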

Looping through columns in R

自作多情 Posted on 2020-01-06 03:26:34
Question: I am trying to run a linear regression of each variable against x; the data columns are x, y1, y2, y3, and so on. This is the code I am using:

    gen <- read.table("CH0032_time_soma.out", sep = "\t", header = TRUE)
    dat <- gen[, c(1, 3:1131)]
    dat_y <- dat[, c(2:1130)]
    dat_x <- dat[, c(1)]
    for (i in names(dat_y)) {
        model = lm(i ~ dat_x, dat)
    }

I keep getting this error:

    Error in model.frame.default(formula = i ~ dat_x, data = dat,
        drop.unused.levels = TRUE) : invalid type (list) for variable 'dat_x'
    Calls: lm -> eval -> eval -> <Anonymous> -> …
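For cross-reference, here is the same one-regression-per-column loop written in Python with statsmodels; the data frame is a randomly generated stand-in for the question's table, with one x column and the rest as responses.

    import numpy as np
    import pandas as pd
    import statsmodels.api as sm

    # Random stand-in for the question's table: column 0 is x,
    # the remaining columns are the responses y1, y2, y3, ...
    rng = np.random.default_rng(0)
    gen = pd.DataFrame(rng.normal(size=(50, 4)),
                       columns=["x", "y1", "y2", "y3"])

    X = sm.add_constant(gen["x"])
    models = {}
    for name in gen.columns[1:]:
        # One OLS fit per response column, all regressed on the shared x.
        models[name] = sm.OLS(gen[name], X).fit()

    slopes = {name: m.params["x"] for name, m in models.items()}
    print(slopes)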

How do I change colours of confidence interval lines when using `matlines` for prediction plot?

寵の児 Posted on 2020-01-05 05:57:42
Question: I'm plotting a logarithmic regression's line of best fit, along with the confidence intervals around that line. The code I'm using works well enough, except that I'd rather both confidence-interval lines be "gray" (rather than the defaults, "red" and "green"). Unfortunately, I don't see a way to isolate them when specifying colour changes. I'd like the regression line to have lty = 1, col = "black", and the confidence-interval lines to have lty = 2, col = "gray". How can I achieve this? My code is of the…
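The question concerns R's matlines; as a cross-language illustration of the same styling (solid black fit line, dashed gray interval lines), here is a matplotlib sketch. The data and the crude ±2·RMSE band are fabricated purely to have something to draw and are not a substitute for a proper confidence interval.

    import numpy as np
    import matplotlib.pyplot as plt

    # Made-up data with a logarithmic trend, echoing the question's setup.
    rng = np.random.default_rng(0)
    x = np.linspace(1, 10, 40)
    y = 2.0 * np.log(x) + rng.normal(scale=0.3, size=x.size)

    b, a = np.polyfit(np.log(x), y, 1)   # fit y ~ a + b*log(x)
    fit = a + b * np.log(x)
    half = 2 * np.sqrt(np.mean((y - fit) ** 2))   # rough band half-width

    plt.scatter(x, y, s=12)
    plt.plot(x, fit, linestyle="-", color="black")          # regression line
    plt.plot(x, fit - half, linestyle="--", color="gray")   # lower band line
    plt.plot(x, fit + half, linestyle="--", color="gray")   # upper band line
    plt.show()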