linear-regression

Linear Regression with Python numpy

Submitted by ☆樱花仙子☆ on 2019-12-21 12:07:36
Question: I'm trying to make a simple linear regression function but continue to encounter a numpy.linalg.linalg.LinAlgError: Singular matrix error. Existing function (with debug prints):

    def makeLLS(inputData, targetData):
        print "In makeLLS:"
        print "  Shape inputData:", inputData.shape
        print "  Shape targetData:", targetData.shape
        term1 = np.dot(inputData.T, inputData)
        term2 = np.dot(inputData.T, targetData)
        print "  Shape term1:", term1.shape
        print "  Shape term2:", term2.shape
        #print term1
        #print term2
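The makeLLS above is building the normal equations (term1 = XᵀX, term2 = Xᵀy), and the singular-matrix error means XᵀX cannot be inverted, typically because of duplicate/collinear columns or fewer rows than columns. A minimal sketch of one workaround, assuming the goal is ordinary least squares: let np.linalg.lstsq factor the design matrix directly instead of inverting the normal matrix.

```python
import numpy as np

def make_lls(input_data, target_data):
    """Least-squares fit that tolerates a rank-deficient design matrix.

    np.linalg.lstsq works on the design matrix directly (via SVD), so it
    never needs to invert input_data.T @ input_data the way solving the
    normal equations does.
    """
    coeffs, residuals, rank, sv = np.linalg.lstsq(input_data, target_data, rcond=None)
    return coeffs

# Toy data: y = 2*x + 1, with a column of ones for the intercept.
X = np.array([[1.0, 1.0], [2.0, 1.0], [3.0, 1.0], [4.0, 1.0]])
y = np.array([3.0, 5.0, 7.0, 9.0])
w = make_lls(X, y)  # w[0] = slope, w[1] = intercept
```

The function and data names here are illustrative, not from the question; the point is only that lstsq sidesteps the explicit inversion.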

How to get the slope of a linear regression line using C++?

Submitted by 萝らか妹 on 2019-12-21 11:33:50
Question: I need to obtain the slope of a linear regression, similar to the way the Excel SLOPE function in the link below is implemented: http://office.microsoft.com/en-gb/excel-help/slope-function-HP010342903.aspx Is there a library in C++, or a simple coded solution someone has created, which can do this? I have implemented code according to this formula (taken from here: http://easycalculation.com/statistics/learn-regression.php), however it does not always give me the correct results:

    Slope(b) = (NΣXY - (ΣX)(ΣY)) / (NΣX² - (ΣX)²)
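The question asks for C++, but the SLOPE formula is language-agnostic; here is the same computation sketched in Python for brevity. A common source of "incorrect results" when porting this formula to C++ is integer arithmetic: the sums must be accumulated in a floating-point type (double), or NΣX² - (ΣX)² can overflow or truncate.

```python
def slope(xs, ys):
    """Slope of the least-squares line, matching Excel's SLOPE formula:
    b = (N*sum(xy) - sum(x)*sum(y)) / (N*sum(x^2) - sum(x)^2)
    """
    n = len(xs)
    sx = sum(xs)
    sy = sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return (n * sxy - sx * sy) / (n * sxx - sx * sx)

b = slope([1, 2, 3, 4], [2, 4, 6, 8])  # perfectly linear y = 2x, so b == 2.0
```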

Interpreting the alias table when testing multicollinearity of a model in R

Submitted by 巧了我就是萌 on 2019-12-21 06:21:24
Question: Could someone help me interpret the alias function output for testing for multicollinearity in a multiple regression model? I know some predictor variables in my model are highly correlated, and I want to identify them using the alias table.

    Model : Score ~ Comments + Pros + Cons + Advice + Response + Value +
        Recommendation + 6Months + 12Months + 2Years + 3Years + Daily +
        Weekly + Monthly

    Complete :
                (Intercept) Comments Pros Cons Advice Response Value1
    UseMonthly1 0           0        0    0    0      0        0
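R's alias() reports terms that are *exact* linear combinations of other terms: each row of the "Complete" block is an aliased predictor, and the entries are the coefficients of the combination (a row of zeros across the columns shown just means those particular columns don't participate). The underlying check can be sketched numerically (Python here, hypothetical data) by comparing the rank of the design matrix with its number of columns:

```python
import numpy as np

# Hypothetical design matrix: the third column equals the sum of the first
# two, i.e. it is an exactly aliased (perfectly collinear) predictor.
X = np.array([
    [1.0, 0.0, 1.0],
    [0.0, 1.0, 1.0],
    [1.0, 1.0, 2.0],
    [2.0, 0.0, 2.0],
])

rank = np.linalg.matrix_rank(X)
n_aliased = X.shape[1] - rank  # number of redundant (aliased) columns
```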

Equations for 2 variable Linear Regression

Submitted by ╄→尐↘猪︶ㄣ on 2019-12-21 06:07:50
Question: We are using a programming language that does not have a linear regression function in it. We have already implemented a single-variable linear equation, y = Ax + B, and have simply calculated the A and B coefficients from the data using a solution similar to this Stack Overflow answer. I know this problem gets geometrically harder as variables are added, but for our purposes we only need to add one more: z = Ax + By + C. Does anyone have the closed-form equations, or code in any language, that
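For two predictors the closed form is a 3×3 normal-equation system built from raw sums, which ports to any language that can solve a 3×3 linear system (or apply Cramer's rule by hand). A sketch, shown in Python only for concreteness:

```python
import numpy as np

def fit_plane(xs, ys, zs):
    """Closed-form least squares for z = A*x + B*y + C via the normal equations:

        [ Σx²  Σxy  Σx ] [A]   [ Σxz ]
        [ Σxy  Σy²  Σy ] [B] = [ Σyz ]
        [ Σx   Σy   N  ] [C]   [ Σz  ]
    """
    xs, ys, zs = map(np.asarray, (xs, ys, zs))
    n = len(xs)
    M = np.array([
        [np.sum(xs * xs), np.sum(xs * ys), np.sum(xs)],
        [np.sum(xs * ys), np.sum(ys * ys), np.sum(ys)],
        [np.sum(xs),      np.sum(ys),      n],
    ])
    v = np.array([np.sum(xs * zs), np.sum(ys * zs), np.sum(zs)])
    A, B, C = np.linalg.solve(M, v)
    return A, B, C

# Exact data from z = 2x + 3y + 1 is recovered exactly.
A, B, C = fit_plane([0, 1, 0, 1, 2], [0, 0, 1, 1, 1], [1, 3, 4, 6, 8])
```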

Pandas/Statsmodels OLS: predicting future values

Submitted by 不打扰是莪最后的温柔 on 2019-12-21 05:47:17
Question: I've been trying to get a prediction for future values in a model I've created. I have tried both OLS in pandas and statsmodels. Here is what I have in statsmodels:

    import statsmodels.api as sm
    endog = pd.DataFrame(dframe['monthly_data_smoothed8'])
    smresults = sm.OLS(dframe['monthly_data_smoothed8'], dframe['date_delta']).fit()
    sm_pred = smresults.predict(endog)
    sm_pred

The length of the array returned is equal to the number of records in my original dataframe, but the values are not the same.

Python linear regression: predict by date

Submitted by 独自空忆成欢 on 2019-12-21 04:45:08
Question: I want to predict a value at a date in the future with simple linear regression, but I can't due to the date format. This is the dataframe I have:

    data_df =
    date        value
    2016-01-15  1555
    2016-01-16  1678
    2016-01-17  1789
    ...

    y = np.asarray(data_df['value'])
    X = data_df[['date']]
    X_train, X_test, y_train, y_test = train_test_split(X, y, train_size=.7, random_state=42)
    model = LinearRegression()   # create linear regression object
    model.fit(X_train, y_train)  # train model on train data
    model.score(X_train, y_train)

How to check for correlation among continuous and categorical variables in Python?

Submitted by 南笙酒味 on 2019-12-21 04:32:39
Question: I have a dataset including categorical (binary) variables and continuous variables. I'm trying to apply a linear regression model for predicting a continuous variable. Can someone please let me know how to check for correlation between the categorical variables and the continuous target variable? Current code:

    import pandas as pd
    df_hosp = pd.read_csv('C:\Users\LAPPY-2\Desktop\LengthOfStay.csv')
    data = df_hosp[['lengthofstay', 'male', 'female', 'dialysisrenalendstage', 'asthma', \
        'irondef',
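For a binary (0/1) variable against a continuous one, the point-biserial correlation is simply Pearson's r computed on the 0/1 coding, so pandas' Series.corr() already covers this case. A sketch on hypothetical stand-in data (the real LengthOfStay.csv is not available here):

```python
import pandas as pd

# Hypothetical stand-in for the question's data: one binary predictor,
# one continuous target.
df_hosp = pd.DataFrame({
    'male':         [1, 0, 1, 0, 1, 0, 1, 0],
    'lengthofstay': [6, 3, 7, 2, 5, 4, 8, 3],
})

# Point-biserial correlation == Pearson r on the 0/1 coding.
r = df_hosp['male'].corr(df_hosp['lengthofstay'])
```

For many predictors at once, df_hosp.corr()['lengthofstay'] gives the whole column of correlations in one call.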

What is the most accurate method in python for computing the minimum norm solution or the solution obtained from the pseudo-inverse?

Submitted by 筅森魡賤 on 2019-12-20 17:42:33
Question: My goal is to solve Kc = y with the pseudo-inverse (i.e. the minimum-norm solution), c = K^{+} y, such that the model is a (hopefully) high-degree polynomial, f(x) = sum_i c_i x^i. I am especially interested in the underdetermined case, where we have more polynomial features than data points (fewer equations than variables/unknowns): columns = deg+1 > N = rows. Note K is the Vandermonde matrix of polynomial features. I was initially using the Python function np.linalg.pinv, but then I noticed something
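Both np.linalg.pinv and np.linalg.lstsq return the minimum-norm least-squares solution; lstsq works on K directly via SVD and is generally the numerically safer route for ill-conditioned Vandermonde matrices. A sketch of the underdetermined case the question describes (toy numbers, np.vander builds K):

```python
import numpy as np

# Underdetermined setup: N = 3 data points, deg+1 = 6 polynomial features.
x = np.array([0.0, 0.5, 1.0])
y = np.array([1.0, 2.0, 3.0])
K = np.vander(x, 6, increasing=True)  # Vandermonde matrix, columns x^0 .. x^5

c_pinv = np.linalg.pinv(K) @ y                   # explicit pseudo-inverse
c_lstsq, *_ = np.linalg.lstsq(K, y, rcond=None)  # SVD-based, same minimum-norm c

# The system is consistent and underdetermined, so both coefficient vectors
# should reproduce the data exactly.
fit_err = np.max(np.abs(K @ c_pinv - y))
```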

Running multiple, simple linear regressions from dataframe in R

Submitted by 馋奶兔 on 2019-12-20 14:15:38
Question: I have a dataset (data frame) with 5 columns, all containing numeric values. I'm looking to run a simple linear regression for each pair of columns in the dataset. For example, if the columns were named A, B, C, D, E, I want to run lm(A~B), lm(A~C), lm(A~D), ..., lm(D~E), ..., and then I want to plot the data for each pair along with the regression line. I'm pretty new to R, so I'm sort of spinning my wheels on how to actually accomplish this. Should I use ddply? Or lapply? I'm not really sure how to
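In R the idiomatic answer is lapply (or Map) over the ordered pairs of column names, calling lm on each formula. To make the pairwise-loop structure concrete, here is the same idea sketched in Python on hypothetical random data (the column names A–E match the question):

```python
from itertools import permutations

import numpy as np
import pandas as pd

# Hypothetical 5-column numeric frame standing in for the question's data.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.normal(size=(30, 5)), columns=list('ABCDE'))

# One simple regression per ordered pair, mirroring lm(A~B), lm(A~C), ...
fits = {}
for resp, pred in permutations(df.columns, 2):
    slope, intercept = np.polyfit(df[pred], df[resp], 1)
    fits[(resp, pred)] = (slope, intercept)
```

With 5 columns there are 5×4 = 20 ordered pairs, so fits holds 20 (slope, intercept) results; each pair's scatter plot plus its fitted line can then be drawn in a loop over fits.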
