linear-regression

How to create many linear models at once and put the coefficients into a new matrix?

╄→гoц情女王★ Posted on 2020-01-11 07:51:44
Question: I have 365 columns, each holding 60 values. I need to know the rate of change over time for each column (the slope, or linear coefficient). I created a generic column as the sequence 1:60 to represent the 60 corresponding time intervals. I want to fit 365 linear regression models, pairing the generic time-stamp column with each of the 365 data columns. In other words, I have many columns and I would like to create many linear regression models at once, extract the slope coefficients, and store them in a new matrix.
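A minimal Python sketch of one way to do this (the question names no language, and the data below are randomly generated stand-ins): ordinary least squares accepts a matrix of right-hand sides, so a single solve fits all 365 regressions at once.

    import numpy as np

    # Stand-in data: 60 time points (rows) x 365 series (columns).
    rng = np.random.default_rng(0)
    data = rng.normal(size=(60, 365))

    t = np.arange(1, 61)                        # generic time stamps 1..60
    A = np.column_stack([t, np.ones_like(t)])   # design matrix [t, 1]

    # One least-squares solve fits every column simultaneously.
    coefs, *_ = np.linalg.lstsq(A, data, rcond=None)
    slopes, intercepts = coefs[0], coefs[1]     # each of shape (365,)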

Efficient Cointegration Test in Python

冷暖自知 Posted on 2020-01-10 07:58:20
Question: I am wondering if there is a better way to test whether two variables are cointegrated than the following method:

    import numpy as np
    import statsmodels.api as sm
    import statsmodels.tsa.stattools as ts

    y = np.random.normal(0, 1, 250)
    x = np.random.normal(0, 1, 250)

    def cointegration_test(y, x):
        # Step 1: regress one variable on the other
        ols_result = sm.OLS(y, x).fit()
        # Step 2: obtain the residual (ols_result.resid)
        # Step 3: apply the Augmented Dickey-Fuller test to see whether
        # the residual has a unit root
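For what it's worth, statsmodels ships this Engle-Granger procedure as a single call, statsmodels.tsa.stattools.coint, which regresses one series on the other and runs an augmented Dickey-Fuller test on the residuals internally. A short sketch using the question's random data:

    import numpy as np
    from statsmodels.tsa.stattools import coint

    y = np.random.normal(0, 1, 250)
    x = np.random.normal(0, 1, 250)

    # Returns the ADF t-statistic on the residuals, its p-value,
    # and the 1%/5%/10% critical values.
    t_stat, p_value, crit_values = coint(y, x)
    print(p_value)   # small p-value -> evidence of cointegration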

R: multiple linear regression model and prediction model

会有一股神秘感。 Posted on 2020-01-09 12:04:33
Question: Starting from a linear model, model1 = lm(temp ~ alt + sdist), I need to develop a prediction model where new data will come to hand and predictions about temp will be made. I have tried something like this: model2 = predict.lm(model1, newdata = newdataset). However, I am not sure this is the right way. What I would like to know is whether this is the right way to make predictions about temp. I am also a bit confused about newdataset: which values should be filled in for the predictors?
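The question is in R, but the pattern is the same everywhere: newdataset must contain one column per predictor (here alt and sdist), and the fitted model fills in temp. A Python sketch of the equivalent workflow with statsmodels' formula API (the training values below are made up purely for illustration):

    import pandas as pd
    import statsmodels.formula.api as smf

    # Made-up training data using the question's variable names.
    train = pd.DataFrame({
        "temp":  [12.1, 9.4, 15.2, 11.0, 8.7, 13.6],
        "alt":   [200, 850, 120, 430, 990, 310],
        "sdist": [3.2, 7.8, 1.1, 4.5, 9.0, 2.6],
    })
    model1 = smf.ols("temp ~ alt + sdist", data=train).fit()

    # New data carries only the predictors; predict() produces temp.
    newdataset = pd.DataFrame({"alt": [300, 700], "sdist": [2.0, 6.5]})
    predictions = model1.predict(newdataset)
    print(predictions)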

Different model performance evaluations by statsmodels and scikit-learn

做~自己de王妃 Posted on 2020-01-07 09:22:09
Question: I am trying to fit a multivariable linear regression to a dataset to find out how well the model explains the data. My predictors have 120 dimensions and I have 177 samples: X.shape = (177, 120), y.shape = (177,). Using statsmodels, I get a very good R-squared of 0.76 with a Prob(F-statistic) of 0.06, which trends towards significance and indicates a good model for the data. When I use scikit-learn's linear regression and try to compute the 5-fold cross-validation r2 score, I get an average r2 score of…
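One likely explanation, sketched below: with 120 predictors and only 177 samples, OLS can achieve a high in-sample R² even on pure noise, while the cross-validated R² on held-out folds collapses. The data here are random by construction, so neither library is wrong; they measure different things.

    import numpy as np
    from sklearn.linear_model import LinearRegression
    from sklearn.model_selection import cross_val_score

    rng = np.random.default_rng(0)
    X = rng.normal(size=(177, 120))   # 120 pure-noise predictors
    y = rng.normal(size=177)          # target unrelated to X

    # In-sample R^2 is high: 120 free parameters partly memorize 177 points.
    in_sample_r2 = LinearRegression().fit(X, y).score(X, y)

    # Out-of-sample (5-fold cross-validated) R^2 drops to near or below zero.
    cv_r2 = cross_val_score(LinearRegression(), X, y, cv=5, scoring="r2").mean()
    print(in_sample_r2, cv_r2)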

Different Linear Regression Coefficients with statsmodels and sklearn

戏子无情 Posted on 2020-01-07 05:58:05
Question: I was planning to use sklearn.linear_model to plot a graph of the linear regression result, and statsmodels.api to get a detailed summary of the fit. However, the two packages produce very different results on the same input. For example, the constant term from sklearn is 7.8e-14, but the constant term from statsmodels is 48.6. (I added a column of 1's to x for the constant term when using both methods.) My code for both methods is succinct:

    # Use statsmodels linear regression to get a…
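A plausible culprit, offered as a guess rather than a diagnosis of the asker's exact code: sklearn's LinearRegression fits its own intercept by default, so a manually added column of ones becomes redundant and its reported coefficient can land near zero (e.g. 7.8e-14). Passing fit_intercept=False makes the two libraries agree. Illustrative data below:

    import numpy as np
    import statsmodels.api as sm
    from sklearn.linear_model import LinearRegression

    rng = np.random.default_rng(0)
    x = rng.normal(size=(100, 1))
    y = 48.6 + 3.0 * x[:, 0] + rng.normal(size=100)

    X = sm.add_constant(x)                  # explicit column of ones

    sm_params = sm.OLS(y, X).fit().params   # [intercept, slope]

    # fit_intercept=False because X already carries the constant column;
    # the default (True) would add a second, competing intercept.
    sk = LinearRegression(fit_intercept=False).fit(X, y)
    print(sm_params, sk.coef_)              # now essentially identical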

How to extrapolate simple linear regression and get errors for the coefficients in Python?

走远了吗. Posted on 2020-01-06 07:05:41
Question: Here is my sample data:

    import numpy as np
    import matplotlib.pylab as pl

    x = np.array([19.0, 47.0, 34.6, 23.2, 33.5, 28.2, 34.8, 15.8, 23.8])
    y = np.array([6.12, 3.55, 2.67, 2.81, 5.34, 3.75, 3.43, 1.44, 0.84])

    pl.scatter(x, y, facecolors='b', edgecolors='b', s=24)
    x = x[:, np.newaxis]
    a, _, _, _ = np.linalg.lstsq(x, y)
    pl.plot(x, a * x, 'r-')
    pl.xlim(0, 50)
    pl.ylim(0, 7)

You can see in the plot that the linear fit does not reach y=0. How can I find the x-value at which y=0 (i.e., extrapolate the data)? And is there a way to get an error estimate for the fit coefficients?
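One way to get both pieces (not necessarily the asker's intended route): np.polyfit with cov=True returns the covariance matrix of the fitted coefficients, and the fitted line can then be solved for y = 0. Note the question's lstsq call fits a line through the origin by construction; the sketch below fits an intercept so the y = 0 crossing is meaningful.

    import numpy as np

    x = np.array([19.0, 47.0, 34.6, 23.2, 33.5, 28.2, 34.8, 15.8, 23.8])
    y = np.array([6.12, 3.55, 2.67, 2.81, 5.34, 3.75, 3.43, 1.44, 0.84])

    # Degree-1 fit with intercept; cov=True also returns the coefficient
    # covariance matrix, whose diagonal holds the variances.
    (slope, intercept), cov = np.polyfit(x, y, 1, cov=True)
    slope_err, intercept_err = np.sqrt(np.diag(cov))

    # The fitted line intercept + slope*x crosses y = 0 at x = -intercept/slope.
    x_at_y0 = -intercept / slope
    print(slope, slope_err, intercept, intercept_err, x_at_y0)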

Looping through columns in R

自作多情 Posted on 2020-01-06 03:26:34
Question: I am trying to run a linear regression of each variable against x; the data columns are x, y1, y2, y3, and so on. This is the code I am using:

    gen <- read.table("CH0032_time_soma.out", sep = "\t", header = TRUE)
    dat <- gen[, c(1, 3:1131)]
    dat_y <- dat[, c(2:1130)]
    dat_x <- dat[, c(1)]
    for (i in names(dat_y)) {
        model = lm(i ~ dat_x, dat)
    }

I keep getting this error:

    Error in model.frame.default(formula = i ~ dat_x, data = dat,
        drop.unused.levels = TRUE) : invalid type (list) for variable 'dat_x'
    Calls: lm -> eval -> eval -> <Anonymous> -> …
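For cross-reference, here is the same one-regression-per-column loop written in Python with statsmodels; the data frame is a randomly generated stand-in for the question's table, with one x column and the rest as responses.

    import numpy as np
    import pandas as pd
    import statsmodels.api as sm

    # Random stand-in for the question's table: column 0 is x,
    # the remaining columns are the responses y1, y2, y3, ...
    rng = np.random.default_rng(0)
    gen = pd.DataFrame(rng.normal(size=(50, 4)),
                       columns=["x", "y1", "y2", "y3"])

    X = sm.add_constant(gen["x"])
    models = {}
    for name in gen.columns[1:]:
        # One OLS fit per response column, all regressed on the shared x.
        models[name] = sm.OLS(gen[name], X).fit()

    slopes = {name: m.params["x"] for name, m in models.items()}
    print(slopes)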

How do I change colours of confidence interval lines when using `matlines` for prediction plot?

寵の児 Posted on 2020-01-05 05:57:42
Question: I'm plotting a logarithmic regression's line of best fit, along with the confidence intervals around that line. The code I'm using works well enough, except that I'd rather both confidence-interval lines be "gray" (rather than the defaults, "red" and "green"). Unfortunately, I don't see a way to isolate them when specifying colour changes. I'd like the regression line to have lty = 1, col = "black", and the confidence-interval lines to have lty = 2, col = "gray". How can I achieve this? My code is of the…
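The question concerns R's matlines; as a cross-language illustration of the same styling (solid black fit line, dashed gray interval lines), here is a matplotlib sketch. The data and the crude ±2·RMSE band are fabricated purely to have something to draw and are not a substitute for a proper confidence interval.

    import numpy as np
    import matplotlib.pyplot as plt

    # Made-up data with a logarithmic trend, echoing the question's setup.
    rng = np.random.default_rng(0)
    x = np.linspace(1, 10, 40)
    y = 2.0 * np.log(x) + rng.normal(scale=0.3, size=x.size)

    b, a = np.polyfit(np.log(x), y, 1)   # fit y ~ a + b*log(x)
    fit = a + b * np.log(x)
    half = 2 * np.sqrt(np.mean((y - fit) ** 2))   # rough band half-width

    plt.scatter(x, y, s=12)
    plt.plot(x, fit, linestyle="-", color="black")          # regression line
    plt.plot(x, fit - half, linestyle="--", color="gray")   # lower band line
    plt.plot(x, fit + half, linestyle="--", color="gray")   # upper band line
    plt.show()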