regression

sklearn LogisticRegression without regularization

拜拜、爱过 submitted on 2019-12-03 11:12:58
Question: The logistic regression class in sklearn comes with L1 and L2 regularization. How can I turn off regularization to get the "raw" logistic fit, such as glmfit gives in Matlab? I think I can set C to a large number, but I don't think that is wise. See the documentation for more details: http://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html#sklearn.linear_model.LogisticRegression Answer 1: Yes, choose as large a number as possible. In regularization, the cost function …
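Both options in a minimal sketch (assuming a reasonably recent scikit-learn; penalty="none" was added in version 0.21, and the newest releases spell it penalty=None):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# toy data, only to make the snippet self-contained
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = (X @ np.array([1.5, -2.0, 0.5]) + rng.normal(size=200) > 0).astype(int)

# Option 1: make the regularization negligible with a huge C (C is the inverse penalty strength)
clf_large_c = LogisticRegression(C=1e12, solver="lbfgs", max_iter=1000).fit(X, y)

# Option 2 (scikit-learn >= 0.21): switch the penalty off explicitly
clf_no_penalty = LogisticRegression(penalty="none", solver="lbfgs", max_iter=1000).fit(X, y)

print(clf_large_c.coef_)
print(clf_no_penalty.coef_)
```

Both fits should agree with an unregularized logistic regression (such as Matlab's glmfit or R's glm) up to optimizer tolerance.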

Curve Fitting 3D data set

余生长醉 submitted on 2019-12-03 10:03:08
Question: The curve-fitting problem for 2D data is well known (LOWESS, etc.), but given a set of 3D data points, how do I fit a 3D curve (e.g. a smoothing/regression spline) to this data? MORE: I'm trying to find a curve fitting the data provided by the vectors X, Y, Z, which have no known relation. Essentially, I have a 3D point cloud and need to find a 3D trendline. MORE: I apologize for the ambiguity. I tried several approaches (I still haven't tried modifying the linear fit), and a random NN seems to work …
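One way to attack this, sketched below under the assumption that the points can be ordered along the curve, is a parametric smoothing spline through the cloud using SciPy's splprep/splev (an illustration, not the approach from the excerpt):

```python
import numpy as np
from scipy.interpolate import splprep, splev

# noisy 3D points roughly following a helix (illustrative data)
t = np.linspace(0, 4 * np.pi, 200)
x = np.cos(t) + 0.05 * np.random.randn(t.size)
y = np.sin(t) + 0.05 * np.random.randn(t.size)
z = t / (4 * np.pi) + 0.05 * np.random.randn(t.size)

# splprep assumes the points are ordered along the curve;
# s controls how much the spline is allowed to smooth the data
tck, u = splprep([x, y, z], s=t.size * 0.05)

# evaluate the fitted 3D curve on a fine parameter grid
u_fine = np.linspace(0, 1, 500)
x_s, y_s, z_s = splev(u_fine, tck)
```

If the cloud is unordered, the points first need a parameterization (e.g. ordering by a principal-curve or nearest-neighbour heuristic) before a spline like this makes sense.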

Factor levels default to 1 and 2 in R | Dummy variable

荒凉一梦 submitted on 2019-12-03 09:39:14
I am transitioning from Stata to R. In Stata, if I label factor levels (say 0 and 1) as (M and F), 0 and 1 remain as they are. Moreover, this is required for dummy-variable linear regression in most software, including Excel and SPSS. However, I've noticed that R recodes factor levels to 1, 2 instead of 0, 1. I don't know why R does this, although regression internally (and correctly) treats the factor as 0 and 1. I would appreciate any help. Here's what I did: Try #1:
sex <- c(0,1,0,1,1)
sex <- factor(sex, levels = c(1,0), labels = c("F","M"))
str(sex)
Factor w/ 2 levels "F","M": 2 …
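For comparison only (the question itself is about R), the same idea in Python with pandas: the integer codes of a Categorical are positional (0, 1, ...) regardless of the original 0/1 values, and the 0/1 dummies for regression are built explicitly:

```python
import pandas as pd

sex = pd.Series([0, 1, 0, 1, 1])

# attach labels; the underlying codes are positions in `categories`, not the original values
sex_cat = pd.Categorical(sex, categories=[0, 1]).rename_categories(["M", "F"])
print(sex_cat.codes)  # [0 1 0 1 1]

# for a dummy-variable regression, expand to explicit 0/1 columns
dummies = pd.get_dummies(pd.Series(sex_cat), drop_first=True)
print(dummies.head())
```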

Iteratively forecasting dyn models

大兔子大兔子 submitted on 2019-12-03 08:45:51
I've written a function to iteratively forecast models built using the package dyn, and I'd like some feedback on it. Is there a better way to do this? Has someone written canonical "forecast" methods for the dyn class (or the dynlm class), or am I venturing into uncharted territory here?
ipredict <- function(model, newdata, interval = "none", level = 0.95, na.action = na.pass, weights = 1) {
  P <- predict(model, newdata = newdata, interval = interval, level = level, na.action = na.action, weights = weights)
  for (i in seq(1, dim(newdata)[1])) {
    if (is.na(newdata[i])) {
      if (interval == "none") {
        P[i] <- predict(model …
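The general pattern, sketched here in Python rather than R and with a hypothetical lag-1 regression standing in for the dyn model, is: predict one step, write the prediction back into the data, and repeat so later steps can use it as their lagged regressor:

```python
import numpy as np

def iterative_forecast(intercept, coef_lag, history, n_steps):
    """One-step-ahead forecasts from a simple lag-1 model, feeding each
    prediction back in as the next step's lagged value.
    (Hypothetical stand-in for repeatedly calling predict() on a dyn/dynlm fit.)"""
    values = list(history)
    for _ in range(n_steps):
        next_val = intercept + coef_lag * values[-1]  # predict one step ahead
        values.append(next_val)                       # write it back so it becomes the new lag
    return np.array(values[len(history):])

# usage: forecast 5 steps ahead from the last observed value
print(iterative_forecast(0.5, 0.9, history=[1.0, 1.4, 1.9], n_steps=5))
```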

Display regression equation in seaborn regplot [duplicate]

女生的网名这么多〃 submitted on 2019-12-03 08:19:27
Question: This question already has answers here: How to get the numerical fitting results when plotting a regression in seaborn? (2 answers). Closed last year. Does anyone know how to display the regression equation in seaborn using sns.regplot or sns.jointplot? regplot doesn't seem to have any parameter that can be passed to display regression diagnostics, and jointplot only displays the Pearson R^2 and p-value. I'm looking for a way to see the slope coefficient, standard error, and intercept as …
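A common workaround (a sketch, not a built-in regplot parameter) is to fit the line separately, e.g. with scipy.stats.linregress, and annotate the axes with the numbers you need:

```python
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
from scipy import stats

# toy data, only to make the snippet self-contained
x = np.random.rand(50)
y = 2.0 * x + 0.3 + 0.1 * np.random.randn(50)

# fit the line separately so slope, intercept, and standard error are available
slope, intercept, r, p, stderr = stats.linregress(x, y)

ax = sns.regplot(x=x, y=y)
ax.annotate(f"y = {slope:.2f}x + {intercept:.2f}\nR^2 = {r**2:.2f}, SE = {stderr:.3f}",
            xy=(0.05, 0.9), xycoords="axes fraction")
plt.show()
```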

How to obtain RMSE out of lm result?

不问归期 submitted on 2019-12-03 07:38:24
Question: I know there is a small difference between $sigma and the concept of root mean squared error. So, I am wondering: what is the easiest way to obtain the RMSE from the lm function in R?
res <- lm(randomData$price ~ randomData$carat + randomData$cut + randomData$color + randomData$clarity + randomData$depth + randomData$table + randomData$x + randomData$y + randomData$z)
length(coefficients(res)) shows 24 coefficients, so I cannot build the model manually anymore. So, how can I evaluate the RMSE based on …
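The computation itself is just the square root of the mean squared residual (in R, sqrt(mean(residuals(res)^2))). A Python analogue with statsmodels, on toy data standing in for randomData:

```python
import numpy as np
import statsmodels.api as sm

# toy data standing in for the question's randomData
rng = np.random.default_rng(1)
X = sm.add_constant(rng.normal(size=(200, 3)))
y = X @ np.array([1.0, 2.0, -1.0, 0.5]) + rng.normal(size=200)

fit = sm.OLS(y, X).fit()

# RMSE: square root of the mean squared residual
rmse = np.sqrt(np.mean(fit.resid ** 2))
print(rmse)
```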

Difference in Differences in Python + Pandas

馋奶兔 submitted on 2019-12-03 07:32:57
I'm trying to perform a Difference in Differences analysis (with panel data and fixed effects) using Python and Pandas. I have no background in economics and I'm just trying to filter the data and run the method that I was told to. However, as far as I could learn, I understood that the basic diff-in-diffs model looks like this: I.e., I am dealing with a multivariable model. Here follows a simple example in R: https://thetarzan.wordpress.com/2011/06/20/differences-in-differences-estimation-in-r-and-stata/ As can be seen, the regression takes as input one dependent variable and three sets …
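The basic two-period specification is y = β0 + β1·treated + β2·post + β3·(treated × post) + ε, where β3 is the diff-in-diffs estimate. A minimal sketch with pandas and statsmodels (illustrative data and column names, not the questioner's):

```python
import pandas as pd
import statsmodels.formula.api as smf

# toy panel: outcome y, treatment-group dummy, post-period dummy (illustrative data)
df = pd.DataFrame({
    "y":       [10, 11, 12, 13, 10, 11, 15, 17],
    "treated": [0, 0, 1, 1, 0, 0, 1, 1],
    "post":    [0, 0, 0, 0, 1, 1, 1, 1],
})

# treated*post expands to treated + post + treated:post;
# the interaction coefficient is the diff-in-diffs estimate
fit = smf.ols("y ~ treated * post", data=df).fit()
print(fit.params)
```

With panel data and unit fixed effects, the group dummy is typically replaced by unit dummies, e.g. C(unit) in the formula.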

Clustered Standard Errors with data containing NAs

烂漫一生 submitted on 2019-12-03 07:25:46
I'm unable to cluster standard errors in R, following the guidance in this post. The cl function returns the error: Error in tapply(x, cluster1, sum) : arguments must have same length. After reading up on tapply, I'm still not sure why my cluster argument is the wrong length and what is causing this error. Here is a link to the data set that I'm using: https://www.dropbox.com/s/y2od7um9pp4vn0s/Ec%201820%20-%20DD%20Data%20with%20Controls.csv Here is the R code:
# read in data
charter <- read.csv(file.choose())
View(charter)
colnames(charter)
# standardize NAEP scores
charter$naep.standardized <- …
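A likely cause of that tapply error is that lm silently drops rows with NAs while the cluster vector keeps its full length, so the two no longer line up; dropping the NAs from both before fitting removes the mismatch. For comparison, cluster-robust errors in Python with statsmodels (the column names here are made up, not from the charter data set):

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# illustrative data frame with some missing values (hypothetical column names)
df = pd.DataFrame({
    "score":   np.random.randn(100),
    "charter": np.random.randint(0, 2, 100),
    "state":   np.random.choice(list("ABCDE"), 100),
})
df.loc[::17, "score"] = np.nan

# drop NAs first so the regression sample and the cluster variable line up
df_clean = df.dropna(subset=["score", "charter", "state"])

fit = smf.ols("score ~ charter", data=df_clean).fit(
    cov_type="cluster", cov_kwds={"groups": df_clean["state"]}
)
print(fit.bse)  # cluster-robust standard errors
```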

Multiple-output Gaussian Process regression in scikit-learn

左心房为你撑大大i submitted on 2019-12-03 07:24:49
Question: I am using scikit-learn for Gaussian process regression (GPR) to predict data. My training data are as follows:
x_train = np.array([[0,0],[2,2],[3,3]]) # 2-D cartesian coordinate points
y_train = np.array([[200,250,155],[321,345,210],[417,445,851]]) # observed output from three different data sources at the respective input points (x_train)
The test points (2-D) where the mean and variance/standard deviation need to be predicted are:
xvalues = np.array([0,1,2,3])
yvalues = np.array([0,1 …
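scikit-learn's GaussianProcessRegressor accepts a 2-D y, so the three data sources can be fit together. A minimal sketch using the training data from the question (the kernel, and the assumption that yvalues mirrors xvalues, are mine, since the excerpt is truncated):

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel

x_train = np.array([[0, 0], [2, 2], [3, 3]])      # 2-D input points
y_train = np.array([[200, 250, 155],
                    [321, 345, 210],
                    [417, 445, 851]])              # three outputs per input point

# kernel choice is an assumption, not from the question
kernel = ConstantKernel(1.0) * RBF(length_scale=1.0)
gpr = GaussianProcessRegressor(kernel=kernel, normalize_y=True).fit(x_train, y_train)

# build the 2-D test grid (assuming yvalues mirrors xvalues; the excerpt is cut off)
xvalues = np.array([0, 1, 2, 3])
yvalues = np.array([0, 1, 2, 3])
xx, yy = np.meshgrid(xvalues, yvalues)
x_test = np.column_stack([xx.ravel(), yy.ravel()])

# predictive mean (and std) for all three outputs; the std shape for
# multi-output models varies a bit across scikit-learn versions
mean, std = gpr.predict(x_test, return_std=True)
print(mean.shape, std.shape)
```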

How to export coefficients of the regression analysis to a spreadsheet or csv file?

这一生的挚爱 submitted on 2019-12-03 06:59:26
Question: I am new to RStudio, and I guess my question is pretty easy to solve, but a lot of searching did not help me. I am running a regression, and summary(regression1) shows me all the coefficients and so on. Now I am using coef(regression1), so it gives me only the coefficients, which I want to export to a file. write.csv(coef, file="regression1.csv") gives the error "Error in as.data.frame.default(x[[i]], optional = TRUE) : cannot coerce class ""function"" to a data.frame". It would be great if you could …
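The error comes from passing the function coef itself rather than its result; write.csv(coef(regression1), file = "regression1.csv") avoids it. For completeness, the same export in Python with statsmodels and pandas (a toy regression standing in for regression1):

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# toy regression standing in for regression1 (column names are hypothetical)
df = pd.DataFrame({"y": np.random.randn(50), "x": np.random.randn(50)})
fit = smf.ols("y ~ x", data=df).fit()

# fit.params is a pandas Series of coefficients; writing it out is one line
fit.params.to_csv("regression1.csv", header=["estimate"])
```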