linear-regression

Multivariate multiple linear regression using Sklearn

一曲冷凌霜 submitted on 2020-04-13 07:19:45
Question: I want to train a linear model Y = M_1*X_1 + M_2*X_2 using sklearn with multidimensional input and output samples (e.g. vectors). I tried the following code: from sklearn import linear_model from pandas import DataFrame x1 = [[1,2],[2,3],[3,4]] x2 = [[1,1],[3,2],[3,5]] y = [[1,0],[1,2],[2,3]] model = { 'vec1': x1, 'vec2': x2, 'compound_vec': y} df = DataFrame(model, columns=['vec1','vec2','compound_vec']) x = df[['vec1','vec2']].astype(object) y = df['compound_vec'].astype(object) regr =
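
One possible reading of the model in this question: because X_1 and X_2 are vectors for each sample, they can be concatenated into a single feature row, which turns Y = M_1*X_1 + M_2*X_2 into an ordinary multi-output least-squares problem. The following is only a minimal sketch in plain NumPy, assuming no intercept term and reusing the sample values from the question. The names X, W, M1, M2 and Y_hat are illustrative, and np.linalg.lstsq stands in for the sklearn estimator the asker was working with.

```python
import numpy as np

# Sample values copied from the question.
x1 = np.array([[1, 2], [2, 3], [3, 4]], dtype=float)
x2 = np.array([[1, 1], [3, 2], [3, 5]], dtype=float)
Y = np.array([[1, 0], [1, 2], [2, 3]], dtype=float)

# Concatenate the two input vectors of each sample into one feature row.
X = np.hstack([x1, x2])            # shape (3, 4)

# Solve min ||X W - Y|| for all output columns at once (no intercept).
W, _, rank, _ = np.linalg.lstsq(X, Y, rcond=None)   # W has shape (4, 2)

# Split the stacked weights back into the two matrices of the model,
# so that for one sample y = M1 @ x1_vec + M2 @ x2_vec.
M1 = W[:2].T                       # shape (2, 2)
M2 = W[2:].T                       # shape (2, 2)

Y_hat = x1 @ M1.T + x2 @ M2.T      # predictions for the training samples
```

With only three samples and four unknowns per output column the system is underdetermined, so the fit reproduces Y exactly and is not unique; np.linalg.lstsq returns the minimum-norm solution.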

How to solve several independent time series at the same time using scikit linear regression model

本小妞迷上赌 submitted on 2020-04-05 22:00:24
Question: I am trying to predict multiple independent time series simultaneously using the sklearn linear regression model, but I cannot seem to get it right. My data is organised as follows: Xn is a matrix where each row contains a forecast window of 4 observations, and yn are the target values for each row of Xn. import numpy as np # training data X1=np.array([[-0.31994,-0.32648,-0.33264,-0.33844],[-0.32648,-0.33264,-0.33844,-0.34393],[-0.33264,-0.33844,-0.34393,-0.34913],[-0.33844,-0.34393,-0.34913,-0

How can I do 3064 regressions using the lapply function

岁酱吖の submitted on 2020-03-25 05:53:30
Question: Hi, I am starting to use R and am stuck on analyzing my data. I have a dataframe with 157 columns. Column 1 is the dependent variable and columns 2 to 157 are the independent variables: columns 2 to 79 are one type of independent variable (n = 78) and columns 80 to 157 are another type (n = 78). I want to perform (78 x 78 = 6084) multiple linear regressions, keeping the first independent variable of the model fixed one at a time, from columns 2 to 79. I can fix the

Gradient descent implementation python - contour lines

烈酒焚心 submitted on 2020-03-18 05:17:20
Question: As a self-study exercise I am trying to implement gradient descent on a linear regression problem from scratch and plot the resulting iterations on a contour plot. My gradient descent implementation gives the correct result (tested with Sklearn); however, the gradient descent path doesn't seem to be perpendicular to the contour lines. Is this expected or did I get something wrong in my code / understanding? Algorithm Cost function and gradient descent import numpy as np import pandas as pd from
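
For reference, the kind of setup this question describes can be sketched as follows: batch gradient descent on a mean-squared-error cost for a one-feature linear model, recording every iterate so the path can later be drawn over a contour plot. This is not the asker's code (which is cut off above). The synthetic data, the learning rate lr, the iteration count, and all function names are assumptions chosen for illustration.

```python
import numpy as np

# Synthetic data (an assumption): y is roughly 1 + 2*x plus small noise.
rng = np.random.default_rng(0)
x = rng.uniform(0, 2, size=50)
y = 1.0 + 2.0 * x + rng.normal(scale=0.1, size=50)
X = np.column_stack([np.ones_like(x), x])   # column of ones for the intercept

def cost(theta):
    """Half mean squared error of the linear model X @ theta."""
    r = X @ theta - y
    return (r @ r) / (2 * len(y))

def gradient(theta):
    """Gradient of cost() with respect to theta."""
    return X.T @ (X @ theta - y) / len(y)

theta = np.zeros(2)
lr = 0.1                                    # learning rate (assumed value)
path = [theta.copy()]                       # iterates, kept for plotting
for _ in range(2000):
    theta = theta - lr * gradient(theta)
    path.append(theta.copy())
path = np.array(path)                       # shape (2001, 2)

# Closed-form least-squares solution, for comparison.
theta_exact = np.linalg.lstsq(X, y, rcond=None)[0]
costs = np.array([cost(t) for t in path])
```

Mathematically, each step is taken along the negative gradient, which is orthogonal to the level curve of the cost through the current point. One thing worth checking when a plotted path does not look perpendicular is whether the two parameter axes are drawn at the same scale, since unequal axis scaling distorts angles.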

Comparing Results from StandardScaler vs Normalizer in Linear Regression

两盒软妹~` submitted on 2020-02-27 04:25:09
Question: I'm working through some examples of linear regression under different scenarios, comparing the results from using Normalizer and StandardScaler, and the results are puzzling. I'm using the Boston housing dataset, and prepping it this way: import numpy as np import pandas as pd from sklearn.datasets import load_boston from sklearn.preprocessing import Normalizer from sklearn.preprocessing import StandardScaler from sklearn.linear_model import LinearRegression #load the data df = pd.DataFrame
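
As background to the comparison in this question, the two transforms act along different axes with their default settings: StandardScaler rescales each feature (column) to zero mean and unit variance, while Normalizer rescales each sample (row) to unit L2 norm. The sketch below reproduces both operations in plain NumPy on a tiny made-up matrix. The values in X are illustrative only, and the NumPy expressions are stand-ins for the sklearn classes rather than calls to them.

```python
import numpy as np

# Made-up data (an assumption): two features on very different scales.
X = np.array([[1.0, 200.0],
              [2.0, 400.0],
              [3.0, 900.0]])

# Column-wise standardization, as StandardScaler does by default:
# subtract each feature's mean and divide by its population std.
X_std = (X - X.mean(axis=0)) / X.std(axis=0)

# Row-wise L2 normalization, as Normalizer does by default:
# divide each sample by its own Euclidean norm.
X_norm = X / np.linalg.norm(X, axis=1, keepdims=True)
```

Because the row-wise version rescales every sample by a different factor, it changes the relationship between features and target in a way that column-wise standardization does not, which may account for some of the differing results.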

ggplot2; single regression line when colour is coded for by a variable?

你离开我真会死。 submitted on 2020-02-24 17:29:19
Question: I am trying to create a scatterplot in ggplot2 with one regression line even though colour depends on the 'Survey Type' variable. Ideally, I would also like to specify which survey type gets which colour (community = red, subnational = green, national = blue). This is the code I'm running, which currently gives me 3 separate regression lines, one for each survey type. ggplot(data=data.male,aes(x=mid_year, y=mean_tc, colour =condition)) + geom_point(shape=1) + geom_smooth(method=lm, data=data

Shaping data for linear regression with TFlearn

不问归期 submitted on 2020-02-23 04:39:05
Question: I'm trying to expand the tflearn example for linear regression by increasing the number of columns to 21. from trafficdata import X,Y import tflearn print(X.shape) #(1054, 21) print(Y.shape) #(1054,) # Linear Regression graph input_ = tflearn.input_data(shape=[None,21]) linear = tflearn.single_unit(input_) regression = tflearn.regression(linear, optimizer='sgd', loss='mean_square', metric='R2', learning_rate=0.01) m = tflearn.DNN(regression) m.fit(X, Y, n_epoch=1000, show_metric=True,

AttributeError: module 'statsmodels.formula.api' has no attribute 'OLS'

青春壹個敷衍的年華 submitted on 2020-02-14 05:45:48
Question: I am trying to use Ordinary Least Squares for multivariable regression, but it says that there is no attribute 'OLS' in the statsmodels.formula.api library. I am following the code from a lecture on Udemy. The code is as follows: import statsmodels.formula.api as sm X_opt = X[:,[0,1,2,3,4,5]] #OrdinaryLeastSquares regressor_OLS = sm.OLS(endog = y, exog = X_opt).fit( The error is as follows: AttributeError Traceback (most recent call last) <ipython-input-19-3bdb0bc861c6> in <module>() 2 X_opt =