linear-regression

ValueError: Error when checking target: expected dense_44 to have shape (1,) but got array with shape (3,). They seem to match though

非 Y 不嫁゛ submitted on 2020-02-06 08:39:06

Question: I've searched several similar topics covering comparable problems, for example this, this and this, among others. Despite this, I still haven't managed to solve my issue, which is why I'm now asking the community. What I'm ultimately trying to do is predict three parameters with a CNN and regression. The inputs are matrices (which can now be plotted as RGB images after several pre-processing steps) with an initial size of (3724, 4073, 3). Due to the size of the data set I'm feeding the
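The mismatch in the title is between the network's last layer (one unit, hence expected shape (1,)) and the targets (three values per sample). A minimal numpy sketch of the shape rule Keras enforces; `check_target_shape` is a hypothetical stand-in for Keras's internal check, not its actual API:

```python
import numpy as np

def check_target_shape(output_units, y):
    """Mimic Keras's target check: the last Dense layer must have as
    many units as each target row has values."""
    target_dim = y.shape[1] if y.ndim > 1 else 1
    if output_units != target_dim:
        raise ValueError(
            f"expected output to have shape ({output_units},) "
            f"but got array with shape ({target_dim},)")
    return True

y = np.zeros((100, 3))           # three regression targets per sample
# Dense(1) fails, Dense(3) passes -- the usual fix is model.add(Dense(3)).
try:
    check_target_shape(1, y)
except ValueError as e:
    print("mismatch:", e)
print(check_target_shape(3, y))  # True
```

In other words, for a three-parameter regression the final layer needs three output units.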

Very large loss values when training multiple regression model in Keras

╄→尐↘猪︶ㄣ submitted on 2020-02-06 07:38:10

Question: I was trying to build a multiple regression model to predict housing prices using the following features: [bedrooms bathrooms sqft_living view grade] = [0.09375 0.266667 0.149582 0.0 0.6]. I have standardized and scaled the features using sklearn.preprocessing.MinMaxScaler. I used Keras to build the model:

    def build_model(X_train):
        model = Sequential()
        model.add(Dense(5, activation='relu', input_shape=X_train.shape[1:]))
        model.add(Dense(1))
        optimizer = Adam(lr=0.001)
        model.compile(loss
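Very large loss values are expected when the features are min-max scaled but the target (price, in the hundreds of thousands) is not, because MSE is measured in squared target units. A small synthetic illustration with made-up prices and errors:

```python
import numpy as np

rng = np.random.default_rng(0)
prices = rng.uniform(100_000, 1_000_000, size=200)  # synthetic house prices
preds = prices + rng.normal(0, 50_000, size=200)    # model off by ~50k per house

mse_raw = np.mean((prices - preds) ** 2)            # on the order of 2.5e9

# Min-max scale *both* targets and predictions to [0, 1]
lo, hi = prices.min(), prices.max()
mse_scaled = np.mean(((prices - lo) / (hi - lo)
                      - (preds - lo) / (hi - lo)) ** 2)

print(f"raw MSE {mse_raw:.3g}  vs  scaled MSE {mse_scaled:.3g}")
```

Either scale the target the same way as the features, or report RMSE in original units so the loss is interpretable.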

Dummy Variable in Multiple Linear Regression

为君一笑 submitted on 2020-01-24 20:38:10

Question: Why do we use one fewer dummy variable than the number of categories in a multiple linear regression model? For example, if the model contains 4 dummy variables, we update our feature vector for training the regression model: x = x[:, 1:4]. Answer 1: Because of the dummy variable trap. When including dummy variables in a regression model, one should be careful of the dummy variable trap: a scenario in which the independent variables are multicollinear - a
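The trap can be shown directly with a rank check: together with an intercept, a full set of dummies is perfectly collinear (the dummies sum to the intercept column), and dropping one reference column restores full column rank. A small numpy sketch with 4 hypothetical categories:

```python
import numpy as np

# 4 categories, 12 samples, full one-hot encoding
cats = np.array([0, 1, 2, 3] * 3)
full = np.eye(4)[cats]                        # all 4 dummy columns
intercept = np.ones((12, 1))

X_trap = np.hstack([intercept, full])         # intercept + all 4 dummies
X_ok = np.hstack([intercept, full[:, 1:]])    # drop one reference dummy

print(np.linalg.matrix_rank(X_trap))  # 4, but 5 columns: rank deficient
print(np.linalg.matrix_rank(X_ok))    # 4, with 4 columns: full rank
```

The dropped category becomes the reference level that the intercept absorbs.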

How to plot from MuMIn model.avg() summary

我的未来我决定 submitted on 2020-01-24 12:45:11

Question: Is there a way to directly plot model-averaged summary outputs from MuMIn model.avg() for different variables, with confidence bands? Previously I had been using ggplot and ggpredict to plot terms from the individual models, but I haven't been able to find a way to plot the results of the averaged models. Clearly I can plot the slope and intercept manually, but getting accurate confidence bands by plotting from confint() is not ideal, and I have yet to get confidence bands from the intervals that
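Although the plotting itself is ggplot territory, the quantities behind a confidence ribbon can be computed by hand: average the per-model predictions with Akaike weights and use the unconditional standard error of Burnham & Anderson, which is (to my understanding) the same kind of quantity MuMIn reports for averaged coefficients. A hedged numpy sketch, in Python rather than R, with made-up per-model predictions and AIC values:

```python
import numpy as np

# Hypothetical: predictions and standard errors from 3 candidate models
# at the same grid of x values, plus each model's AIC.
preds = np.array([[2.0, 2.1], [2.4, 2.6], [1.8, 1.9]])  # (models, x-grid)
ses = np.array([[0.2, 0.2], [0.3, 0.3], [0.25, 0.25]])
aic = np.array([100.0, 101.5, 104.0])

delta = aic - aic.min()
w = np.exp(-0.5 * delta)
w /= w.sum()                                  # Akaike weights

avg = w @ preds                               # model-averaged prediction
# Unconditional SE (Burnham & Anderson): folds in between-model spread
se_avg = w @ np.sqrt(ses ** 2 + (preds - avg) ** 2)

lower, upper = avg - 1.96 * se_avg, avg + 1.96 * se_avg
print(avg, lower, upper)                      # values to feed a ribbon plot
```

Computed over a grid of predictor values, `avg`, `lower` and `upper` are exactly what a ribbon layer needs.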

negative value for “mean_squared_error”

◇◆丶佛笑我妖孽 submitted on 2020-01-23 12:22:08

Question: I am using scikit-learn with mean_squared_error as the scoring function for model evaluation in cross_val_score:

    rms_score = cross_validation.cross_val_score(model, X, y, cv=20, scoring='mean_squared_error')

I am using mean_squared_error because it is a regression problem, and the estimators (model) used are lasso, ridge and elasticNet. For all these estimators I am getting negative values for rms_score. How is that possible, given that the differences in y values are squared? Answer 1: You get
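The negative sign is a convention, not a bug: cross_val_score always maximizes its scorer, so error metrics are returned negated (newer scikit-learn versions make this explicit with the scorer name 'neg_mean_squared_error'). A tiny numpy illustration of flipping the sign back:

```python
import numpy as np

y_true = np.array([3.0, 5.0, 2.5])
y_pred = np.array([2.5, 5.0, 3.0])

mse = np.mean((y_true - y_pred) ** 2)  # always >= 0

# cross_val_score maximizes its scorer, so error metrics come back
# negated: a higher (less negative) score means a better model.
neg_mse_score = -mse                   # what cross_val_score hands back

rmse = np.sqrt(-neg_mse_score)         # flip the sign back before sqrt
print(neg_mse_score, rmse)
```

So take the negative of the returned scores before interpreting (or square-rooting) them.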

Spark ml and PMML export

跟風遠走 submitted on 2020-01-23 05:54:20

Question: I know that it's possible to export models as PMML with Spark-MLlib, but what about Spark-ML? Is it possible to convert a LinearRegressionModel from org.apache.spark.ml.regression to a LinearRegressionModel from org.apache.spark.mllib.regression in order to invoke its toPMML() method? Answer 1: You can convert Spark ML pipelines to PMML using the JPMML-SparkML library:

    StructType schema = dataFrame.schema();
    PipelineModel pipelineModel = pipeline.fit(dataFrame);
    org.dmg.pmml.PMML pmml = org.jpmml

In R package “segmented”, How could I set the slope of one of lines in the model to 0?

为君一笑 submitted on 2020-01-23 01:18:11

Question: I am using the R package segmented to estimate parameters for a model in which the response variable is linearly correlated with the explanatory variable up to a breakpoint, after which the response variable becomes independent of the explanatory variable. In other words, a segmented linear model whose second segment has slope = 0. What I have done so far is:

    linear1 <- lm(Y ~ X)
    linear2 <- segmented(linear1, seg.Z = ~ X, psi = 2)

This gives a model that has a very good first line, but the
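One language-agnostic way to see the constrained fit: a model of the form y = a + b*min(x, psi) rises with slope b up to the breakpoint and is exactly flat afterwards, so psi can be grid-searched while a and b are solved by ordinary least squares. A numpy sketch on synthetic data, in Python rather than R; this is not the segmented package's estimator (which refines psi iteratively), just the same constraint made explicit:

```python
import numpy as np

rng = np.random.default_rng(1)
x = np.linspace(0, 10, 80)
true_psi = 4.0
y = 1.0 + 0.8 * np.minimum(x, true_psi) + rng.normal(0, 0.1, x.size)

best = None
for psi in np.linspace(1, 9, 161):  # grid over candidate breakpoints
    # y = a + b * min(x, psi): slope b before psi, slope 0 after it
    X = np.column_stack([np.ones_like(x), np.minimum(x, psi)])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    sse = np.sum((y - X @ coef) ** 2)
    if best is None or sse < best[0]:
        best = (sse, psi, coef)

sse, psi_hat, (a_hat, b_hat) = best
print(psi_hat, a_hat, b_hat)  # breakpoint near 4; flat after it by construction
```

The zero slope after the breakpoint is built into the design matrix rather than estimated, which is exactly the constraint the question asks for.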

Why do I get NA coefficients and how does `lm` drop reference level for interaction

别来无恙 submitted on 2020-01-21 07:26:06

Question: I am trying to understand how R determines reference groups for interactions in a linear model. Consider the following:

    df <- structure(list(id = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 4L, 5L, 5L, 5L, 5L, 5L, 5L), .Label = c("1", "2", "3", "4", "5"), class = "factor"), year = structure(c(1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L), .Label = c("1", "2"),
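lm reports NA for coefficients whose design-matrix columns are exact linear combinations of earlier ones, and dropping each factor's reference level (treatment coding) is what removes that redundancy. The effect can be reproduced by building the design matrix by hand; a numpy sketch in Python with hypothetical 3-level and 2-level factors, not the poster's data:

```python
import numpy as np

id_ = np.repeat(np.arange(3), 4)   # 3-level factor
year = np.tile(np.arange(2), 6)    # 2-level factor
n = id_.size

I = np.eye(3)[id_]                 # one-hot id
Y = np.eye(2)[year]                # one-hot year
ones = np.ones((n, 1))

# Naive matrix: intercept + every dummy + every interaction cell
inter_full = np.einsum('ni,nj->nij', I, Y).reshape(n, -1)
X_naive = np.hstack([ones, I, Y, inter_full])

# Treatment coding: drop each factor's first level; interact the rest
X_treat = np.hstack([ones, I[:, 1:], Y[:, 1:],
                     np.einsum('ni,nj->nij', I[:, 1:], Y[:, 1:]).reshape(n, -1)])

print(X_naive.shape[1], np.linalg.matrix_rank(X_naive))  # 12 columns, rank 6
print(X_treat.shape[1], np.linalg.matrix_rank(X_treat))  # 6 columns, rank 6
```

The naive matrix has 12 columns but only 6 independent directions (one per factor-level cell); lm would alias 6 of them and print NA, while the treatment-coded matrix is exactly full rank.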

How to only print (adjusted) R-squared of regression model?

半世苍凉 submitted on 2020-01-16 18:30:29

Question: I am a beginner with R. I have a data set on air pollution. The columns are site, the measured concentration, and 80 variables (v1-v80) that might influence the concentration. I want to build a model with forward stepwise regression based on R-squared/adjusted R-squared, using my own code (so I do not want to use something like step() or regsubset()). The dependent variable is concentration, with v1-v80 as the independent variables. I wrote the following code for the first step (the data set is simplified):
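The forward loop itself is only a few lines: at each step, try every remaining variable, keep the one that maximizes adjusted R-squared. A hedged sketch in Python rather than R, with synthetic data standing in for v1-v80 and a hand-rolled adj_r2 instead of a library call:

```python
import numpy as np

def adj_r2(X, y):
    """Adjusted R-squared of an OLS fit with intercept."""
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    ss_res = np.sum((y - X1 @ beta) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    n, p = X1.shape
    return 1 - (ss_res / (n - p)) / (ss_tot / (n - 1))

def forward_stepwise(X, y, n_steps=3):
    """Greedily add the column that most improves adjusted R-squared."""
    chosen, remaining = [], list(range(X.shape[1]))
    for _ in range(n_steps):
        scores = {j: adj_r2(X[:, chosen + [j]], y) for j in remaining}
        best = max(scores, key=scores.get)
        chosen.append(best)
        remaining.remove(best)
    return chosen

rng = np.random.default_rng(2)
X = rng.normal(size=(100, 8))   # stand-ins for v1..v8
y = 2 * X[:, 0] - 3 * X[:, 4] + rng.normal(0, 0.5, 100)
print(forward_stepwise(X, y))   # the two true predictors are found first
```

A stopping rule (e.g. stop when adjusted R-squared no longer improves) can replace the fixed `n_steps`.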