regression

Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) : contrasts can be applied only to factors with 2 or more levels

╄→尐↘猪︶ㄣ submitted on 2019-12-06 00:52:57
Question: I have the following code for minimizing the sum of deviations using optim() to find beta0 and beta1, but I am receiving the error above and I am not sure what I am doing wrong:

sum.abs.dev <- function(beta = c(beta0, beta1), a, b) {
    total <- 0
    n <- length(b)
    for (i in 1:n) {
        total <- total + (b[i] - beta[1] - beta[2] * a[i])
    }
    return(total)
}

tlad <- function(y = "farm", x = "land", data = "FarmLandArea.csv") {
    dat <- read.csv(data)
    # fit <- lm(dat$farm ~ dat$land)
    fit <- lm(y ~ x, data = dat)
    beta.out = optim(fit
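Two things stand out in the question's R code: the objective sums signed residuals rather than absolute values (so the optimizer can push it toward minus infinity), and lm(y ~ x) is called with character strings rather than column names, which triggers the contrasts error in the title. A minimal Python sketch of what the objective presumably intends, least absolute deviations (LAD) regression, using hypothetical toy data:

```python
import numpy as np
from scipy.optimize import minimize

def sum_abs_dev(beta, a, b):
    """Sum of absolute deviations |b[i] - beta0 - beta1*a[i]| (note the abs)."""
    return np.sum(np.abs(b - beta[0] - beta[1] * a))

# Toy data lying exactly on y = 2 + 3x, so the LAD fit should recover (2, 3).
a = np.arange(5.0)
b = 2.0 + 3.0 * a

# Nelder-Mead is a reasonable default here: the objective is not
# differentiable where a residual is exactly zero.
result = minimize(sum_abs_dev, x0=[0.0, 0.0], args=(a, b),
                  method="Nelder-Mead",
                  options={"xatol": 1e-8, "fatol": 1e-8})
beta0, beta1 = result.x
```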

How to find a multivariable regression equation in JavaScript

一笑奈何 submitted on 2019-12-05 22:59:14
Question: I have searched Stack Overflow and have not found any question that is really the same as mine, because none involve more than one independent variable. Basically, I have an array of data points and I want to find a regression equation for those points. The code I have so far looks like this (w, x, z are the independent variables and y is the dependent variable):

var dataPoints = [{
    "w" : 1, "x" : 2, "z" : 1, "y" : 7
}, {
    "w" : 2, "x" : 1, "z" : 4, "y" : 5
}, {
    "w" : 1, "x" : 5
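The question asks for JavaScript, but the underlying math is language-agnostic: build a design matrix with an intercept column and solve the least-squares system. A Python sketch with hypothetical data (the question's array is truncated, so the rows and coefficients below are made up for illustration):

```python
import numpy as np

# Hypothetical observations: each row is (w, x, z).
W = np.array([[1, 2, 1],
              [2, 1, 4],
              [1, 5, 2],
              [3, 2, 2],
              [2, 4, 3]], dtype=float)

A = np.column_stack([np.ones(len(W)), W])   # intercept column + w, x, z
true_coef = np.array([1.0, 2.0, 0.5, -1.0])  # chosen so we can check recovery
y = A @ true_coef

# lstsq solves min ||A @ beta - y||^2 -- the same math as the normal
# equations (A^T A) beta = A^T y that a JavaScript port would code by hand.
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
b0, b1, b2, b3 = coef   # intercept and slopes for w, x, z
```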

regression model evaluation using scikit-learn

江枫思渺然 submitted on 2019-12-05 20:13:35
I am doing regression with sklearn and using randomized grid search to evaluate different parameters. Here is a toy example:

from sklearn.datasets import make_regression
from sklearn.metrics import mean_squared_error, make_scorer
from scipy.stats import randint as sp_randint
from sklearn.ensemble import ExtraTreesRegressor
from sklearn.cross_validation import LeaveOneOut
from sklearn.grid_search import GridSearchCV, RandomizedSearchCV

X, y = make_regression(n_samples=10, n_features=10, n_informative=3,
                       random_state=0, shuffle=False)
clf = ExtraTreesRegressor(random_state=12)
param_dist = {"n
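Note that sklearn.cross_validation and sklearn.grid_search were deprecated in scikit-learn 0.18 and removed in 0.20; both utilities now live in sklearn.model_selection. A minimal updated sketch (the search space here is illustrative, since the question's param_dist is truncated):

```python
from scipy.stats import randint as sp_randint
from sklearn.datasets import make_regression
from sklearn.ensemble import ExtraTreesRegressor
from sklearn.model_selection import RandomizedSearchCV  # replaces sklearn.grid_search

X, y = make_regression(n_samples=10, n_features=10, n_informative=3,
                       random_state=0, shuffle=False)
clf = ExtraTreesRegressor(random_state=12)

# Hypothetical search space. For regressors, scoring defaults to R^2;
# "neg_mean_squared_error" ranks candidates by (negated) MSE instead.
param_dist = {"n_estimators": sp_randint(10, 50),
              "max_depth": [2, 3, None]}
search = RandomizedSearchCV(clf, param_distributions=param_dist, n_iter=5,
                            scoring="neg_mean_squared_error", cv=3,
                            random_state=0)
search.fit(X, y)
```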

partial correlation coefficient in pandas dataframe python

扶醉桌前 submitted on 2019-12-05 18:00:57
I have data in a pandas DataFrame like:

df =
   X1  X2  X3       Y
0   1   2  10   5.077
1   2   2   9  32.330
2   3   3   5  65.140
3   4   4   4  47.270
4   5   2   9  80.570

and I want to do multiple regression analysis. Here Y is the dependent variable and X1, X2, and X3 are independent variables. The correlation of each independent variable with the dependent variable is:

df.corr():
          X1        X2        X3         Y
X1  1.000000  0.353553 -0.409644  0.896626
X2  0.353553  1.000000 -0.951747  0.204882
X3 -0.409644 -0.951747  1.000000 -0.389641
Y   0.896626  0.204882 -0.389641  1.000000

As we can see, Y has the highest correlation with X1, so I have selected X1 as the first
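One standard way to get a partial correlation coefficient, which pandas does not provide directly, is to regress each variable on the controls and correlate the residuals. A sketch using the question's data (the helper function partial_corr is hypothetical, not a pandas API):

```python
import numpy as np
import pandas as pd

def partial_corr(df, x, y, controls):
    """Correlation of x and y after removing the linear effect of controls."""
    Z = np.column_stack([np.ones(len(df))] +
                        [df[c].to_numpy(float) for c in controls])
    # Residuals of x and y after projecting out the control variables.
    rx = df[x].to_numpy(float) - Z @ np.linalg.lstsq(Z, df[x].to_numpy(float), rcond=None)[0]
    ry = df[y].to_numpy(float) - Z @ np.linalg.lstsq(Z, df[y].to_numpy(float), rcond=None)[0]
    return np.corrcoef(rx, ry)[0, 1]

df = pd.DataFrame({"X1": [1, 2, 3, 4, 5],
                   "X2": [2, 2, 3, 4, 2],
                   "X3": [10, 9, 5, 4, 9],
                   "Y":  [5.077, 32.330, 65.140, 47.270, 80.570]})
r = partial_corr(df, "X1", "Y", controls=["X2", "X3"])
```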

Using categorical data as features in sklearn LogisticRegression

我的未来我决定 submitted on 2019-12-05 17:49:21
Question: I'm trying to understand how to use categorical data as features in sklearn.linear_model's LogisticRegression. I understand, of course, that I need to encode it. What I don't understand is how to pass the encoded feature to the logistic regression so that it is processed as a categorical feature, rather than having the int value it got during encoding interpreted as a standard quantifiable feature. (Less important) Can somebody explain the difference between using preprocessing.LabelEncoder(), DictVectorizer
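The usual answer is that label-encoded integers impose a false ordering (e.g. "LA" < "NY" < "SF"), while one-hot encoding gives each category its own 0/1 column and hence its own coefficient. One common approach, sketched with hypothetical data (pandas.get_dummies; OneHotEncoder or DictVectorizer from the question achieve the same expansion):

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression

# Hypothetical data: "city" is categorical, "age" numeric, "bought" the target.
df = pd.DataFrame({"city":   ["NY", "SF", "NY", "LA", "SF", "LA"],
                   "age":    [25, 32, 47, 51, 38, 29],
                   "bought": [0, 1, 0, 1, 1, 0]})

# One-hot encode: each city becomes its own indicator column, so the model
# learns a separate coefficient per category instead of treating the
# integer codes 0/1/2 as an ordered quantity.
X = pd.get_dummies(df[["city", "age"]], columns=["city"])
y = df["bought"]

clf = LogisticRegression().fit(X, y)
```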

Arima/Arma Time series Models in Java [closed]

[亡魂溺海] submitted on 2019-12-05 16:46:38
Question: Closed. This question is off-topic and is not currently accepting answers. Closed 5 years ago.

I am looking for ARIMA time series models in Java. Is there any Java library implementing an ARIMA/ARMA model?

Answer 1: Googling should help :) I got this from Google: Java ARMA model, simulation and fitting; Java ARIMA model.

Answer 2: Please refer to the Apache Math Library for other forms of regression, like Simple, OLS,

Best fit plane by minimizing orthogonal distances

孤街醉人 submitted on 2019-12-05 15:59:11
I have a set of points (in the form x1,y1,z1 ... xn,yn,zn) obtained from a surface mesh. I want to find the best-fit 3D plane to these points by minimizing orthogonal distances. The x, y, z coordinates are independent; that is, I want to obtain the coefficients A, B, C, D of the plane equation Ax + By + Cz + D = 0. What would be the algorithm to obtain A, B, C, D? Note: a previous post discussed the best-fit plane in a least-squares sense, treating the z coordinate as a linear function of x, y. However, this is not my case. From memory, this turns into an eigenvector problem. The
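The "eigenvector problem" the question recalls is total least squares: center the points, and the plane normal is the direction of least variance, i.e. the singular vector with the smallest singular value (equivalently, the smallest eigenvector of the covariance matrix). A sketch with synthetic coplanar points:

```python
import numpy as np

def fit_plane(points):
    """Total-least-squares plane: minimizes orthogonal distances.

    Returns (A, B, C, D) with A*x + B*y + C*z + D = 0 and unit normal (A, B, C).
    """
    centroid = points.mean(axis=0)
    # The last right-singular vector of the centered points is the direction
    # of least variance, i.e. the plane normal.
    _, _, vt = np.linalg.svd(points - centroid)
    normal = vt[-1]
    A, B, C = normal
    D = -normal @ centroid   # plane passes through the centroid
    return A, B, C, D

# Points exactly on the plane z = 1 + 2x + 3y, i.e. 2x + 3y - z + 1 = 0.
rng = np.random.default_rng(0)
xy = rng.normal(size=(50, 2))
pts = np.column_stack([xy, 1 + 2 * xy[:, 0] + 3 * xy[:, 1]])
A, B, C, D = fit_plane(pts)
```

Because the normal is only defined up to sign and scale, compare coefficients after normalizing (e.g. divide through so the z coefficient is -1).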

“length of 'dimnames' [1] not equal to array extent” error in linear regression summary in R

ε祈祈猫儿з submitted on 2019-12-05 15:57:13
I'm running a straightforward linear regression model fit on the following dataframe:

> str(model_data_rev)
'data.frame':   128857 obs. of 12 variables:
 $ ENTRY_4 : num  186 218 208 235 256 447 471 191 207 250 ...
 $ ENTRY_8 : num  724 769 791 777 707 237 236 726 773 773 ...
 $ ENTRY_12: num  2853 2989 3174 3027 3028 ...
 $ ENTRY_16: num  2858 3028 3075 2992 3419 ...
 $ ENTRY_20: num  7260 7188 7587 7560 7165 ...
 $ EXIT_4  : num  70 82 105 114 118 204 202 99 73 95 ...
 $ EXIT_8  : num  1501 1631 1594 1576 1536 ...
 $ EXIT_12 : num  3862 3923 4158 3970 3895 ...
 $ EXIT_16 : num  1559 1539 1737 1681 1795 ...
 $ EXIT

Error in scale.default: length of 'center' must equal the number of columns of 'x'

本小妞迷上赌 submitted on 2019-12-05 15:07:45
I am using the mboost package to do some classification. Here is the code:

library('mboost')
load('so-data.rdata')
model <- glmboost(is_exciting~., data=training, family=Binomial())
pred <- predict(model, newdata=test, type="response")

But R complains when doing prediction:

Error in scale.default(X, center = cm, scale = FALSE) :
  length of 'center' must equal the number of columns of 'x'

The data (training and test) can be downloaded here (7z, zip). What is the reason for the error and how do I get rid of it? Thank you.

UPDATE:
> str(training)
'data.frame':   439599 obs. of 24 variables:
 $ is
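A common cause of this kind of error is that test, once factors are expanded into design-matrix columns, has a different set of columns than training, e.g. a factor level present in one data frame but not the other. The general fix, sketched here in Python with pandas on hypothetical data (the question's mboost data is not reproduced), is to align the test design matrix to the training columns:

```python
import pandas as pd

# Hypothetical frames: test contains a level ("green") never seen in training.
train = pd.DataFrame({"color": ["red", "blue", "red"], "size": [1, 2, 3]})
test = pd.DataFrame({"color": ["green", "blue"], "size": [4, 5]})

Xtr = pd.get_dummies(train)
# Reindex the test design matrix to the training columns: unseen levels are
# dropped, levels missing from test are filled with 0, and column order
# matches, so the two matrices have equal width.
Xte = pd.get_dummies(test).reindex(columns=Xtr.columns, fill_value=0)
```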

StandardScaler with Pipelines and GridSearchCV

▼魔方 西西 submitted on 2019-12-05 15:03:35
I've put a StandardScaler in the pipeline, and the results of CV_mlpregressor.predict(x_test) are weird. I think I must have to bring the values back from the StandardScaler, but I still can't figure out how.

pipe_MLPRegressor = Pipeline([('scaler', StandardScaler()),
                              ('MLPRegressor', MLPRegressor(random_state = 42))])
grid_params_MLPRegressor = [{
    'MLPRegressor__solver': ['lbfgs'],
    'MLPRegressor__max_iter': [100, 200, 300, 500],
    'MLPRegressor__activation': ['relu', 'logistic', 'tanh'],
    'MLPRegressor__hidden_layer_sizes': [(2,), (4,), (2,2), (4,4), (4,2), (10,10), (2,2,2)],
}]
CV_mlpregressor = GridSearchCV
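A scaler inside the pipeline transforms only X, so predictions already come back in the original units of y; "weird" predictions usually mean y was scaled separately outside the pipeline and never inverse-transformed. If scaling the target is actually wanted, scikit-learn's TransformedTargetRegressor handles the inverse transform automatically. A smaller sketch of the question's setup (the data and the reduced search grid are illustrative):

```python
import numpy as np
from sklearn.compose import TransformedTargetRegressor
from sklearn.datasets import make_regression
from sklearn.model_selection import GridSearchCV
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X, y = make_regression(n_samples=200, n_features=5, noise=1.0, random_state=42)

# The scaler in the pipeline standardizes X only; the wrapper also scales y
# and inverse-transforms predictions back to the original units.
pipe = Pipeline([("scaler", StandardScaler()),
                 ("mlp", MLPRegressor(solver="lbfgs", max_iter=500,
                                      random_state=42))])
model = TransformedTargetRegressor(regressor=pipe, transformer=StandardScaler())

# Note the "regressor__" prefix needed to reach pipeline steps through the wrapper.
grid = GridSearchCV(model,
                    {"regressor__mlp__hidden_layer_sizes": [(10,), (20,)]},
                    cv=3)
grid.fit(X, y)
pred = grid.predict(X)   # already back in the original y units
```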