regression

best-found PCA estimator to be used as the estimator in RFECV

Submitted by 老子叫甜甜 on 2019-12-31 06:59:05
Question: This works (mostly from the demo sample at sklearn):

    print(__doc__)

    # Code source: Gaël Varoquaux
    # Modified for documentation by Jaques Grobler
    # License: BSD 3 clause

    import numpy as np
    import matplotlib.pyplot as plt
    from sklearn import linear_model, decomposition, datasets
    from sklearn.linear_model import LinearRegression  # needed for the call below
    from sklearn.pipeline import Pipeline
    from sklearn.model_selection import GridSearchCV
    from scipy.stats import uniform

    lregress = LinearRegression()
    pca = decomposition.PCA()
    pipe = Pipeline(steps=[('pca', pca), (
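The snippet is cut off just as the pipeline is being assembled, presumably heading toward a GridSearchCV over it. As a point of reference, here is a minimal, hedged sketch of how that setup typically continues; the diabetes dataset and the n_components grid are assumptions for illustration, not the asker's actual data:

    # Minimal sketch: tune the number of PCA components inside a pipeline.
    # Dataset and parameter grid are illustrative assumptions.
    from sklearn import datasets, decomposition
    from sklearn.linear_model import LinearRegression
    from sklearn.model_selection import GridSearchCV
    from sklearn.pipeline import Pipeline

    X, y = datasets.load_diabetes(return_X_y=True)

    pipe = Pipeline(steps=[('pca', decomposition.PCA()),
                           ('regress', LinearRegression())])

    # Grid-search the PCA dimensionality; best_estimator_ is then the
    # "best-found PCA estimator" (as a fitted pipeline).
    search = GridSearchCV(pipe, {'pca__n_components': [2, 4, 6, 8]}, cv=5)
    search.fit(X, y)
    print(search.best_params_, search.best_estimator_)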

Regression with equality and inequality constrained coefficients in R

Submitted by 人走茶凉 on 2019-12-31 04:28:07
Question: I am trying to obtain estimated constrained coefficients using RSS. The beta coefficients are constrained to [0,1] and must sum to 1. Additionally, my third parameter is constrained to (-1,1). Using the code below I can obtain a nice solution with simulated variables, but when I apply the methodology to my real data set I keep arriving at a non-unique solution. I'm therefore wondering whether there is a more numerically stable way to obtain my estimated parameters.

    set.seed(234)
    k = 2
    a =
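For comparison, here is a hedged sketch of the same constrained RSS minimization via SLSQP in Python; the simulated data and the model form y ~ b1*x1 + b2*x2 + rho*z are assumptions made only for illustration:

    # Constrained RSS: b1, b2 in [0,1] with b1 + b2 = 1, rho in (-1,1).
    import numpy as np
    from scipy.optimize import minimize

    rng = np.random.default_rng(234)
    x1, x2, z = rng.normal(size=(3, 200))
    y = 0.3 * x1 + 0.7 * x2 - 0.4 * z + rng.normal(scale=0.1, size=200)

    def rss(params):
        b1, b2, rho = params
        resid = y - (b1 * x1 + b2 * x2 + rho * z)
        return resid @ resid

    res = minimize(rss, x0=[0.5, 0.5, 0.0], method='SLSQP',
                   bounds=[(0, 1), (0, 1), (-1, 1)],           # box constraints
                   constraints=[{'type': 'eq',                 # b1 + b2 == 1
                                 'fun': lambda p: p[0] + p[1] - 1}])
    print(res.x)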

How to interpret MSE in Keras Regressor

Submitted by ◇◆丶佛笑我妖孽 on 2019-12-31 03:00:58
Question: I am new to Keras/TF/deep learning and I am trying to build a model to predict house prices. I have some features X (number of bathrooms, etc.) and a target Y (ranging from about $300,000 to $800,000). I have used sklearn's StandardScaler to standardize Y before fitting the model. Here is my Keras model:

    def build_model():
        model = Sequential()
        model.add(Dense(36, input_dim=36, activation='relu'))
        model.add(Dense(18, input_dim=36, activation='relu'))
        model.add(Dense(1, activation='sigmoid'))
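A sigmoid output caps predictions at 1, which clashes with a regression target, even a standardized one. Below is a hedged sketch of the usual fix (a linear output) plus how to read the reported MSE back in dollars; the y_scaler name is an assumption standing in for the asker's fitted StandardScaler:

    # Hedged sketch: linear output for regression; MSE is in standardized units.
    import numpy as np
    from tensorflow.keras.models import Sequential
    from tensorflow.keras.layers import Dense

    def build_model():
        model = Sequential()
        model.add(Dense(36, input_dim=36, activation='relu'))
        model.add(Dense(18, activation='relu'))
        model.add(Dense(1, activation='linear'))   # unbounded output
        model.compile(optimizer='adam', loss='mse')
        return model

    # Because Y was standardized, the reported MSE is in scaled units.
    # To recover an error in dollars (y_scaler = the fitted StandardScaler):
    # rmse_dollars = np.sqrt(mse_scaled) * y_scaler.scale_[0]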

Missing values in MS Excel LINEST, TREND, LOGEST and GROWTH functions

Submitted by 删除回忆录丶 on 2019-12-31 01:47:16
Question: I'm using the GROWTH function (or LINEST, TREND, or LOGEST; they all cause the same trouble) in Excel 2003. The problem is that if some data are missing, the function refuses to return a result. You can download the file here. Is there any workaround? I'm looking for an easy and elegant solution. I don't want the obvious workaround of removing the missing value: that would mean deleting the column, which would also damage the graph, and it would cause problems in my other tables where I have
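Outside Excel, the standard workaround is to drop the missing points before fitting but still predict on the full x-grid, so charts and dependent tables keep all their columns. A hedged Python sketch of that idea, with made-up sample data (GROWTH fits y = b·m^x, i.e. a straight line in log space):

    # Fit an exponential trend while skipping missing observations.
    import numpy as np

    x = np.arange(1.0, 9.0)
    y = np.array([2.1, 4.3, np.nan, 17.2, 33.9, np.nan, 140.1, 281.0])

    mask = ~np.isnan(y)                                   # keep observed points
    slope, intercept = np.polyfit(x[mask], np.log(y[mask]), 1)
    y_fit = np.exp(intercept + slope * x)                 # defined everywhere
    print(y_fit)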

geom_smooth on a subset of data

Submitted by 天涯浪子 on 2019-12-30 17:23:51
Question: Here is some data and a plot:

    set.seed(18)
    data = data.frame(y = c(rep(0:1, 3), rnorm(18, mean = 0.5, sd = 0.1)),
                      colour = rep(1:2, 12),
                      x = rep(1:4, each = 6))
    ggplot(data, aes(x = x, y = y, colour = factor(colour))) +
      geom_point() +
      geom_smooth(method = 'lm', formula = y ~ x, se = F)

As you can see, the linear regression is heavily influenced by the values at x = 1. Can I have the linear regressions calculated for x >= 2 only, but still display the values for x = 1 (where y equals either 0 or 1)? The resulting graph would be exactly the same except for the
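In ggplot2 the usual trick is to give geom_smooth its own data argument (a subset with x >= 2) while geom_point keeps the full data frame. Here is a hedged Python analogue of the same idea, regenerating similar toy data:

    # Plot every point, but fit the regression line on x >= 2 only.
    import numpy as np
    import matplotlib.pyplot as plt

    rng = np.random.default_rng(18)
    x = np.repeat([1.0, 2.0, 3.0, 4.0], 6)
    y = np.concatenate([np.tile([0.0, 1.0], 3), rng.normal(0.5, 0.1, 18)])

    plt.scatter(x, y)                          # all points, including x == 1
    mask = x >= 2                              # regress on the subset only
    slope, intercept = np.polyfit(x[mask], y[mask], 1)
    xs = np.linspace(x.min(), x.max(), 100)
    plt.plot(xs, intercept + slope * xs)
    plt.show()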

How to compute standard error from ODR results?

Submitted by 巧了我就是萌 on 2019-12-30 08:23:10
Question: I use scipy.odr to make a fit with uncertainties on both x and y, following this question: Correct fitting with scipy curve_fit including errors in x? After the fit I would like to compute the uncertainties on the parameters, so I look at the square roots of the diagonal elements of the covariance matrix. I get:

    >>> print(np.sqrt(np.diag(output.cov_beta)))
    [ 0.17516591  0.33020487  0.27856021]

But the Output also contains output.sd_beta which is, according to the odr documentation, "Standard
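The short answer, sketched below on an assumed toy linear fit: sd_beta is the square-rooted diagonal of cov_beta scaled by the residual variance res_var, so the two quantities differ by exactly that factor:

    # Relation between scipy.odr's cov_beta and sd_beta on toy data.
    import numpy as np
    from scipy import odr

    rng = np.random.default_rng(0)
    x = np.linspace(0, 10, 50)
    y = 2.0 * x + 1.0 + rng.normal(0, 0.5, 50)

    model = odr.Model(lambda beta, x: beta[0] * x + beta[1])
    data = odr.RealData(x, y, sx=0.1, sy=0.5)
    output = odr.ODR(data, model, beta0=[1.0, 0.0]).run()

    # cov_beta is not scaled by the residual variance; sd_beta is.
    manual_sd = np.sqrt(np.diag(output.cov_beta) * output.res_var)
    print(np.allclose(manual_sd, output.sd_beta))   # True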

Activation function for output layer for regression models in Neural Networks

Submitted by 泄露秘密 on 2019-12-30 07:53:16
Question: I have been experimenting with neural networks lately and have come across a general question about which activation function to use. This might be a well-known fact, but I couldn't work it out properly. Many of the examples and papers I have seen deal with classification problems, and they use either sigmoid (in the binary case) or softmax (in the multi-class case) as the activation function in the output layer, which makes sense. But I haven't seen any activation function used in the
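For regression the common convention is no activation at all on the output layer, i.e. the identity/linear function, since any squashing function would bound the predictions. A hedged numpy illustration of why that matters:

    # A sigmoid caps outputs in (0, 1); an identity output is unbounded.
    import numpy as np

    z = np.array([-5.0, 0.0, 5.0, 50.0])    # raw pre-activation outputs
    sigmoid = 1 / (1 + np.exp(-z))

    print(sigmoid)   # ~[0.007, 0.5, 0.993, 1.0]: saturates quickly
    print(z)         # identity output: any real value, the usual regression choice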

Weighted Least Square

Submitted by 混江龙づ霸主 on 2019-12-30 06:44:46
Question: I want to run a regression of y ~ x (just one dependent and one independent variable), but I have heteroskedasticity: the variability of y increases as x increases. To deal with it, I would like to use weighted least squares through the gls() function in R, but I have to admit that I don't understand how to use it. I have to supply a variance function to the "weights" argument of gls(), but I don't know which one to choose or how to use it.

Answer 1: Here's an example of taking care of Poisson
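For reference, since the answer's R/gls() example is truncated above, here is a hedged Python counterpart of the same idea: when Var(y) grows with x, weight each observation by the inverse of its assumed variance. The variance being proportional to x^2 is an illustrative assumption, not a general rule:

    # Weighted least squares for variance that grows with x.
    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(1)
    x = np.linspace(1, 10, 100)
    y = 3.0 * x + rng.normal(scale=0.5 * x)        # noise grows with x

    X = sm.add_constant(x)
    wls = sm.WLS(y, X, weights=1.0 / x**2).fit()   # weights = 1 / Var(y_i)
    print(wls.params)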

Getting statsmodels to use heteroskedasticity corrected standard errors in coefficient t-tests

Submitted by 删除回忆录丶 on 2019-12-30 03:10:05
Question: I've been digging into the API of statsmodels.regression.linear_model.RegressionResults and have found how to retrieve different flavors of heteroskedasticity-corrected standard errors (via properties like HC0_se, etc.). However, I can't quite figure out how to get the t-tests on the coefficients to use these corrected standard errors. Is there a way to do this in the API, or do I have to do it manually? If the latter, can you suggest any guidance on how to do this with statsmodels results?
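One supported route, sketched below with simulated heteroskedastic data: refit (or convert) the results with a robust covariance type, so that summary(), the t-tests, and the p-values all use the corrected standard errors:

    # Make the reported t-tests use heteroskedasticity-robust (HC0) errors.
    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(2)
    x = rng.normal(size=200)
    y = 1.0 + 2.0 * x + rng.normal(scale=np.abs(x) + 0.1)   # heteroskedastic

    X = sm.add_constant(x)

    robust = sm.OLS(y, X).fit(cov_type='HC0')                   # refit robustly...
    robust2 = sm.OLS(y, X).fit().get_robustcov_results('HC0')   # ...or convert

    print(robust.summary())   # inference now based on HC0 standard errors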