regression

best-found PCA estimator to be used as the estimator in RFECV

Submitted by 老子叫甜甜 on 2019-12-31 06:59:05
Question: This works (mostly from the demo sample at sklearn):

    print(__doc__)

    # Code source: Gaël Varoquaux
    # Modified for documentation by Jaques Grobler
    # License: BSD 3 clause

    import numpy as np
    import matplotlib.pyplot as plt
    from sklearn import linear_model, decomposition, datasets
    from sklearn.linear_model import LinearRegression  # needed for the call below
    from sklearn.pipeline import Pipeline
    from sklearn.model_selection import GridSearchCV
    from scipy.stats import uniform

    lregress = LinearRegression()
    pca = decomposition.PCA()
    pipe = Pipeline(steps=[('pca', pca), (
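The snippet is cut off just as the pipeline is being assembled, presumably heading toward a GridSearchCV over it. As a point of reference, here is a minimal, hedged sketch of how that setup typically continues; the diabetes dataset and the n_components grid are assumptions for illustration, not the asker's actual data:

    # Minimal sketch: tune the number of PCA components inside a pipeline.
    # Dataset and parameter grid are illustrative assumptions.
    from sklearn import datasets, decomposition
    from sklearn.linear_model import LinearRegression
    from sklearn.model_selection import GridSearchCV
    from sklearn.pipeline import Pipeline

    X, y = datasets.load_diabetes(return_X_y=True)

    pipe = Pipeline(steps=[('pca', decomposition.PCA()),
                           ('regress', LinearRegression())])

    # Grid-search the PCA dimensionality; best_estimator_ is then the
    # "best-found PCA estimator" (as a fitted pipeline).
    search = GridSearchCV(pipe, {'pca__n_components': [2, 4, 6, 8]}, cv=5)
    search.fit(X, y)
    print(search.best_params_, search.best_estimator_)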

Regression with equality and inequality constrained coefficients in R

Submitted by 人走茶凉 on 2019-12-31 04:28:07
Question: I am trying to obtain estimated constrained coefficients using RSS. The beta coefficients are constrained to [0,1] and must sum to 1. Additionally, my third parameter is constrained to (-1,1). Using the code below I can obtain a nice solution with simulated variables, but when I apply the methodology to my real data set I keep arriving at a non-unique solution. I'm therefore wondering whether there is a more numerically stable way to obtain my estimated parameters.

    set.seed(234)
    k = 2
    a =
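For comparison, here is a hedged sketch of the same constrained RSS minimization via SLSQP in Python; the simulated data and the model form y ~ b1*x1 + b2*x2 + rho*z are assumptions made only for illustration:

    # Constrained RSS: b1, b2 in [0,1] with b1 + b2 = 1, rho in (-1,1).
    import numpy as np
    from scipy.optimize import minimize

    rng = np.random.default_rng(234)
    x1, x2, z = rng.normal(size=(3, 200))
    y = 0.3 * x1 + 0.7 * x2 - 0.4 * z + rng.normal(scale=0.1, size=200)

    def rss(params):
        b1, b2, rho = params
        resid = y - (b1 * x1 + b2 * x2 + rho * z)
        return resid @ resid

    res = minimize(rss, x0=[0.5, 0.5, 0.0], method='SLSQP',
                   bounds=[(0, 1), (0, 1), (-1, 1)],           # box constraints
                   constraints=[{'type': 'eq',                 # b1 + b2 == 1
                                 'fun': lambda p: p[0] + p[1] - 1}])
    print(res.x)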

How to interpret MSE in Keras Regressor

Submitted by ◇◆丶佛笑我妖孽 on 2019-12-31 03:00:58
Question: I am new to Keras/TF/deep learning and I am trying to build a model to predict house prices. I have some features X (number of bathrooms, etc.) and a target Y (ranging from about $300,000 to $800,000). I have used sklearn's StandardScaler to standardize Y before fitting the model. Here is my Keras model:

    def build_model():
        model = Sequential()
        model.add(Dense(36, input_dim=36, activation='relu'))
        model.add(Dense(18, input_dim=36, activation='relu'))
        model.add(Dense(1, activation='sigmoid'))
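A sigmoid output caps predictions at 1, which clashes with a regression target, even a standardized one. Below is a hedged sketch of the usual fix (a linear output) plus how to read the reported MSE back in dollars; the y_scaler name is an assumption standing in for the asker's fitted StandardScaler:

    # Hedged sketch: linear output for regression; MSE is in standardized units.
    import numpy as np
    from tensorflow.keras.models import Sequential
    from tensorflow.keras.layers import Dense

    def build_model():
        model = Sequential()
        model.add(Dense(36, input_dim=36, activation='relu'))
        model.add(Dense(18, activation='relu'))
        model.add(Dense(1, activation='linear'))   # unbounded output
        model.compile(optimizer='adam', loss='mse')
        return model

    # Because Y was standardized, the reported MSE is in scaled units.
    # To recover an error in dollars (y_scaler = the fitted StandardScaler):
    # rmse_dollars = np.sqrt(mse_scaled) * y_scaler.scale_[0]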

Missing values in MS Excel LINEST, TREND, LOGEST and GROWTH functions

Submitted by 删除回忆录丶 on 2019-12-31 01:47:16
Question: I'm using the GROWTH function (or LINEST, TREND, or LOGEST; they all cause the same trouble) in Excel 2003. The problem is that if some data are missing, the function refuses to return a result. You can download the file here. Is there any workaround? I'm looking for an easy and elegant solution. I don't want the obvious workaround of removing the missing value: that would mean deleting the column, which would also damage the graph, and it would cause problems in my other tables where I have
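Outside Excel, the standard workaround is to drop the missing points before fitting but still predict on the full x-grid, so charts and dependent tables keep all their columns. A hedged Python sketch of that idea, with made-up sample data (GROWTH fits y = b·m^x, i.e. a straight line in log space):

    # Fit an exponential trend while skipping missing observations.
    import numpy as np

    x = np.arange(1.0, 9.0)
    y = np.array([2.1, 4.3, np.nan, 17.2, 33.9, np.nan, 140.1, 281.0])

    mask = ~np.isnan(y)                                   # keep observed points
    slope, intercept = np.polyfit(x[mask], np.log(y[mask]), 1)
    y_fit = np.exp(intercept + slope * x)                 # defined everywhere
    print(y_fit)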

geom_smooth on a subset of data

Submitted by 天涯浪子 on 2019-12-30 17:23:51
Question: Here is some data and a plot:

    set.seed(18)
    data = data.frame(y = c(rep(0:1, 3), rnorm(18, mean = 0.5, sd = 0.1)),
                      colour = rep(1:2, 12),
                      x = rep(1:4, each = 6))
    ggplot(data, aes(x = x, y = y, colour = factor(colour))) +
      geom_point() +
      geom_smooth(method = 'lm', formula = y ~ x, se = F)

As you can see, the linear regression is heavily influenced by the values at x = 1. Can I have the linear regressions calculated for x >= 2 only, but still display the values for x = 1 (where y equals either 0 or 1)? The resulting graph would be exactly the same except for the
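In ggplot2 the usual trick is to give geom_smooth its own data argument (a subset with x >= 2) while geom_point keeps the full data frame. Here is a hedged Python analogue of the same idea, regenerating similar toy data:

    # Plot every point, but fit the regression line on x >= 2 only.
    import numpy as np
    import matplotlib.pyplot as plt

    rng = np.random.default_rng(18)
    x = np.repeat([1.0, 2.0, 3.0, 4.0], 6)
    y = np.concatenate([np.tile([0.0, 1.0], 3), rng.normal(0.5, 0.1, 18)])

    plt.scatter(x, y)                          # all points, including x == 1
    mask = x >= 2                              # regress on the subset only
    slope, intercept = np.polyfit(x[mask], y[mask], 1)
    xs = np.linspace(x.min(), x.max(), 100)
    plt.plot(xs, intercept + slope * xs)
    plt.show()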

How to compute standard error from ODR results?

Submitted by 巧了我就是萌 on 2019-12-30 08:23:10
Question: I use scipy.odr to make a fit with uncertainties on both x and y, following this question: Correct fitting with scipy curve_fit including errors in x? After the fit I would like to compute the uncertainties on the parameters, so I look at the square roots of the diagonal elements of the covariance matrix. I get:

    >>> print(np.sqrt(np.diag(output.cov_beta)))
    [ 0.17516591  0.33020487  0.27856021]

But the Output also contains output.sd_beta which is, according to the odr documentation, "Standard
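The short answer, sketched below on an assumed toy linear fit: sd_beta is the square-rooted diagonal of cov_beta scaled by the residual variance res_var, so the two quantities differ by exactly that factor:

    # Relation between scipy.odr's cov_beta and sd_beta on toy data.
    import numpy as np
    from scipy import odr

    rng = np.random.default_rng(0)
    x = np.linspace(0, 10, 50)
    y = 2.0 * x + 1.0 + rng.normal(0, 0.5, 50)

    model = odr.Model(lambda beta, x: beta[0] * x + beta[1])
    data = odr.RealData(x, y, sx=0.1, sy=0.5)
    output = odr.ODR(data, model, beta0=[1.0, 0.0]).run()

    # cov_beta is not scaled by the residual variance; sd_beta is.
    manual_sd = np.sqrt(np.diag(output.cov_beta) * output.res_var)
    print(np.allclose(manual_sd, output.sd_beta))   # True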

Activation function for output layer for regression models in Neural Networks

Submitted by 泄露秘密 on 2019-12-30 07:53:16
Question: I have been experimenting with neural networks lately and have come across a general question about which activation function to use. This might be a well-known fact, but I couldn't work it out properly. Many of the examples and papers I have seen deal with classification problems, and they use either sigmoid (in the binary case) or softmax (in the multi-class case) as the activation function in the output layer, which makes sense. But I haven't seen any activation function used in the
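For regression the common convention is no activation at all on the output layer, i.e. the identity/linear function, since any squashing function would bound the predictions. A hedged numpy illustration of why that matters:

    # A sigmoid caps outputs in (0, 1); an identity output is unbounded.
    import numpy as np

    z = np.array([-5.0, 0.0, 5.0, 50.0])    # raw pre-activation outputs
    sigmoid = 1 / (1 + np.exp(-z))

    print(sigmoid)   # ~[0.007, 0.5, 0.993, 1.0]: saturates quickly
    print(z)         # identity output: any real value, the usual regression choice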

Weighted Least Square

Submitted by 混江龙づ霸主 on 2019-12-30 06:44:46
Question: I want to run a regression of y ~ x (just one dependent and one independent variable), but I have heteroskedasticity: the variability of y increases as x increases. To deal with it, I would like to use weighted least squares through the gls() function in R, but I have to admit that I don't understand how to use it. I have to supply a variance function to the "weights" argument of gls(), but I don't know which one to choose or how to use it.

Answer 1: Here's an example of taking care of Poisson
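For reference, since the answer's R/gls() example is truncated above, here is a hedged Python counterpart of the same idea: when Var(y) grows with x, weight each observation by the inverse of its assumed variance. The variance being proportional to x^2 is an illustrative assumption, not a general rule:

    # Weighted least squares for variance that grows with x.
    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(1)
    x = np.linspace(1, 10, 100)
    y = 3.0 * x + rng.normal(scale=0.5 * x)        # noise grows with x

    X = sm.add_constant(x)
    wls = sm.WLS(y, X, weights=1.0 / x**2).fit()   # weights = 1 / Var(y_i)
    print(wls.params)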

Getting statsmodels to use heteroskedasticity corrected standard errors in coefficient t-tests

Submitted by 删除回忆录丶 on 2019-12-30 03:10:05
Question: I've been digging into the API of statsmodels.regression.linear_model.RegressionResults and have found how to retrieve different flavors of heteroskedasticity-corrected standard errors (via properties like HC0_se, etc.). However, I can't quite figure out how to get the t-tests on the coefficients to use these corrected standard errors. Is there a way to do this in the API, or do I have to do it manually? If the latter, can you suggest any guidance on how to do this with statsmodels results?
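One supported route, sketched below with simulated heteroskedastic data: refit (or convert) the results with a robust covariance type, so that summary(), the t-tests, and the p-values all use the corrected standard errors:

    # Make the reported t-tests use heteroskedasticity-robust (HC0) errors.
    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(2)
    x = rng.normal(size=200)
    y = 1.0 + 2.0 * x + rng.normal(scale=np.abs(x) + 0.1)   # heteroskedastic

    X = sm.add_constant(x)

    robust = sm.OLS(y, X).fit(cov_type='HC0')                   # refit robustly...
    robust2 = sm.OLS(y, X).fit().get_robustcov_results('HC0')   # ...or convert

    print(robust.summary())   # inference now based on HC0 standard errors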