regression

Error in `contrasts<-`: "contrasts can be applied only to factors with 2 or more levels"

本小妞迷上赌 submitted on 2019-12-04 04:43:00

Question: I have trained a model and am attempting to use the predict function, but it returns the following error:

```
Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) :
  contrasts can be applied only to factors with 2 or more levels
```

There are several questions on SO and CrossValidated about this, and as I interpret it, the error means that one of the factors in my model has only one level. This is a pretty simple model, with one continuous variable (driveTime) and one factor variable which has …
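A minimal sketch of the usual diagnosis (the data frame and column names here are hypothetical, not the asker's):

```r
# Hypothetical data: a factor column declared with a single level
df <- data.frame(
  driveTime = c(10, 20, 30, 40),
  dayType   = factor(c("weekday", "weekday", "weekday", "weekday"))
)

# Count the levels of every factor column; any factor with fewer than
# 2 levels makes model.matrix(), and hence lm()/predict(), fail with
# the contrasts error
sapply(Filter(is.factor, df), nlevels)

# If a factor lost levels through subsetting, dropping the unused
# levels (or refitting on data that spans >= 2 levels) resolves it
df <- droplevels(df)
```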

Robust SEs (vcovHC) to be shown with texreg in R

為{幸葍}努か submitted on 2019-12-04 04:29:54

Question: I am doing some regressions with the plm package; when needed, I also obtain heteroskedasticity-consistent coefficients. Below are the commands that I run:

```r
library(plm)
library(lmtest)  # for coeftest()
data("Produc", package = "plm")
zz <- plm(log(gsp) ~ log(pcap) + log(pc) + log(emp) + unemp,
          data = Produc, index = c("state", "year"))
summary(zz)
coeftest(zz, vcovHC)
```

My problem starts here. Below is the list of commands to obtain LaTeX output with the help of texreg. How can I integrate the result obtained with the …
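A minimal sketch of one common route, using texreg's `override.se`/`override.pvalues` arguments (object names follow the question):

```r
library(texreg)

# Pull the robust standard errors and p-values computed by coeftest()
ct <- coeftest(zz, vcovHC)

# Hand them to texreg so the LaTeX table reports the robust figures
texreg(zz,
       override.se      = ct[, "Std. Error"],
       override.pvalues = ct[, "Pr(>|t|)"])
```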

R: plm — year fixed effects — year and quarter data

守給你的承諾、 submitted on 2019-12-04 04:08:28

I am having a problem setting up a panel data model. Here is some sample data:

```r
library(plm)
id   <- c(1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2)
year <- c(1999,1999,1999,1999,2000,2000,2000,2000,
          1999,1999,1999,1999,2000,2000,2000,2000)
qtr  <- c(1,2,3,4,1,2,3,4,1,2,3,4,1,2,3,4)
y <- rnorm(16, mean = 0, sd = 1)
x <- rnorm(16, mean = 0, sd = 1)
data <- data.frame(id = id, year = year, qtr = qtr,
                   y_q = paste(year, qtr, sep = "_"), y = y, x = x)
```

I run the following regression using 'id' as the individual index and 'year' as the time index:

```r
reg1 <- plm(y ~ x, data = data, index = c("id", "year"),
            model = "within", effect = "time")
```

Unfortunately, I …
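A minimal sketch of one possible fix (an assumption about the failure: each `id`-`year` pair appears four times, so `year` alone cannot serve as the time index, while the combined year-quarter key can):

```r
# Use the year-quarter identifier as the time index so every
# (individual, period) pair is unique
reg2 <- plm(y ~ x, data = data, index = c("id", "y_q"),
            model = "within", effect = "time")
summary(reg2)

# If the goal is year (not quarter) fixed effects, explicit year
# dummies in a pooled model are another route:
reg3 <- plm(y ~ x + factor(year), data = data,
            index = c("id", "y_q"), model = "pooling")
```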

Interpreting Graphviz output for decision tree regression

99封情书 submitted on 2019-12-04 04:06:43

I'm curious what the value field is in the nodes of the decision tree produced by Graphviz when used for regression. I understand that this is the number of samples in each class separated by a split when using decision tree classification, but I'm not sure what it means for regression. My data has a 2-dimensional input and a 10-dimensional output. Here is an example of what a tree looks like for my regression problem, produced using the code below and visualized with webgraphviz:

```python
import pickle

# X = (n x 2), Y = (n x 10), X_test = (m x 2)
input_scaler = pickle.load(open("../input_scaler.sav", "rb"))
reg = …
```
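For regression trees, `value` is the mean of the training targets of the samples in that node rather than per-class counts; with a 10-dimensional output it is a length-10 vector of per-output means. A minimal sketch with synthetic data (not the asker's):

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor, export_graphviz

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))    # 2-dimensional input
Y = rng.normal(size=(100, 10))   # 10-dimensional output

reg = DecisionTreeRegressor(max_depth=2).fit(X, Y)

# Each node's `value` line shows the 10 per-output means of the samples
# that reached that node; the root's value equals the global mean of Y
print(export_graphviz(reg))
print(Y.mean(axis=0))
```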

How to find a multivariable regression equation in JavaScript

余生颓废 submitted on 2019-12-04 04:01:32

I have searched Stack Overflow and have not found any question that is really the same as mine, because none have more than one independent variable. Basically, I have an array of data points and I want to be able to find a regression equation for those data points. The code I have so far looks like this (w, x, z are the independent variables and y is the dependent variable):

```javascript
var dataPoints = [
  { "w": 1, "x": 2, "z": 1, "y": 7 },
  { "w": 2, "x": 1, "z": 4, "y": 5 },
  { "w": 1, "x": 5, "z": 3, "y": 2 },
  { "w": 4, "x": 3, "z": 5, "y": 15 }
];
```

I would like a function that would …
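A minimal sketch of one way to do it (hand-rolled, not a library API): ordinary least squares via the normal equations (XᵀX)b = Xᵀy, solved by Gauss-Jordan elimination. It assumes the design matrix is not singular, which holds for generic data.

```javascript
function multipleRegression(points, keys, target) {
  // Design matrix rows: [1, w, x, z]; response vector y
  const X = points.map(p => [1, ...keys.map(k => p[k])]);
  const y = points.map(p => p[target]);
  const n = X[0].length;

  // Augmented normal-equation matrix [X'X | X'y]
  const A = Array.from({ length: n }, (_, i) =>
    Array.from({ length: n + 1 }, (_, j) =>
      X.reduce((s, row, r) => s + row[i] * (j < n ? row[j] : y[r]), 0)
    )
  );

  // Gauss-Jordan elimination with partial pivoting
  for (let col = 0; col < n; col++) {
    let piv = col;
    for (let r = col + 1; r < n; r++)
      if (Math.abs(A[r][col]) > Math.abs(A[piv][col])) piv = r;
    [A[col], A[piv]] = [A[piv], A[col]];
    for (let r = 0; r < n; r++) {
      if (r === col) continue;
      const f = A[r][col] / A[col][col];
      for (let c = col; c <= n; c++) A[r][c] -= f * A[col][c];
    }
  }
  return A.map((row, i) => row[n] / row[i][i]);
}

// [b0, b1, b2, b3] so that y ≈ b0 + b1*w + b2*x + b3*z
const coeffs = multipleRegression(dataPoints, ["w", "x", "z"], "y");
```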

Using categorical data as features in sklearn LogisticRegression

大兔子大兔子 submitted on 2019-12-04 03:05:12

I'm trying to understand how to use categorical data as features in sklearn.linear_model's LogisticRegression. I understand, of course, that I need to encode it. What I don't understand is how to pass the encoded feature to the logistic regression so that it is processed as a categorical feature, rather than having the int value it got when encoding interpreted as a standard quantifiable feature. (Less important) Can somebody explain the difference between using preprocessing.LabelEncoder(), DictVectorizer.vocabulary, or just encoding the categorical data yourself with a simple dict? Alex A.'s comment here touches …
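A minimal sketch of the standard answer (the column names are hypothetical): one-hot encode the categorical column so the model sees k independent indicator features instead of one ordinal integer.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder

df = pd.DataFrame({
    "color":  ["red", "blue", "green", "blue"],  # categorical
    "amount": [1.0, 2.0, 0.5, 1.5],              # numeric
})
y = [0, 1, 0, 1]

# One-hot encode `color`; pass the numeric column through untouched
pre = ColumnTransformer(
    [("cat", OneHotEncoder(handle_unknown="ignore"), ["color"])],
    remainder="passthrough",
)
clf = make_pipeline(pre, LogisticRegression()).fit(df, y)
```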

Ignoring missing values in multiple OLS regression with statsmodels

浪子不回头ぞ submitted on 2019-12-04 02:16:22

I'm trying to run a multiple OLS regression using statsmodels and a pandas dataframe. There are missing values in different columns for different rows, and I keep getting the error message:

```
ValueError: array must not contain infs or NaNs
```

I saw this SO question, which is similar but doesn't exactly answer mine: statsmodel.api.Logit: valueerror array must not contain infs or nans. What I would like to do is run the regression and ignore all rows with missing values for the variables used in this regression. Right now I have:

```python
import pandas as pd
import numpy as np
import …
```
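A minimal sketch of the usual fix (hypothetical column names): statsmodels models accept `missing="drop"`, which silently drops incomplete rows; equivalently, you can `dropna()` on just the columns used in the regression.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

df = pd.DataFrame({
    "y":  [1.0, 2.0, np.nan, 4.0],
    "x1": [0.5, np.nan, 1.5, 2.0],
    "x2": [1.0, 2.0, 3.0, 4.0],
})

X = sm.add_constant(df[["x1", "x2"]])
# missing="drop" ignores every row with a NaN in y or X
model = sm.OLS(df["y"], X, missing="drop").fit()
print(model.summary())
```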

Problems displaying LOESS regression line and confidence interval

给你一囗甜甜゛ submitted on 2019-12-04 02:13:51

Question: I am having some issues trying to complete a LOESS regression with a data set. I have been able to create the fitted line properly, but I am unable to plot it correctly. I ran through the data like this:

```r
animals.lo <- loess(X15p5 ~ Period, animals, weights = n.15p5)
animals.lo
summary(animals.lo)
plot(X15p5 ~ Period, animals)
lines(animals$X15p5, animals.lo, col = "red")
```

At this point I received an error:

```
Error in xy.coords(x, y) : 'x' and 'y' lengths differ
```

I searched around and read that this …
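A minimal sketch of the usual fix, reusing the question's variable names (an assumption about the data): `lines()` wants the x values and the fitted values, ordered by x; passing the raw response plus the model object is what triggers the `xy.coords` length error.

```r
# Fitted LOESS line, ordered by the predictor
ord <- order(animals$Period)
lines(animals$Period[ord], fitted(animals.lo)[ord], col = "red")

# An approximate 95% confidence band from predict(..., se = TRUE)
pr <- predict(animals.lo, se = TRUE)
lines(animals$Period[ord], (pr$fit + 2 * pr$se.fit)[ord], lty = 2)
lines(animals$Period[ord], (pr$fit - 2 * pr$se.fit)[ord], lty = 2)
```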

`nls` fitting error: always reaches the maximum number of iterations regardless of starting values

亡梦爱人 submitted on 2019-12-04 01:55:06

Question: Using this parametrization for a logistic growth curve model, I created some points with K = 0.7, y0 = 0.01, r = 0.3:

```r
df <- data.frame(x = seq(1, 50, by = 5))
df$y <- 0.7 / (1 + ((0.7 - 0.01) / 0.01) * exp(-0.3 * df$x))
```

Can someone tell me why I get a fitting error when I created the data from the very values I use as starting values?

```r
fo <- df$y ~ K / (1 + ((K - y0) / y0) * exp(-r * df$x))
model <- nls(fo, start = list(K = 0.7, y0 = 0.01, r = 0.3), df,
             nls.control(maxiter = 1000))
```

Error in nls(fo, start = list(K = 0.7, y0 = 0.01, r = 0.3), df, nls …
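A likely explanation, per the warning in `?nls`: the default algorithm's relative-offset convergence test fails on artificial zero-residual data, so it iterates until `maxiter`. A minimal sketch of the usual workaround, adding a little noise:

```r
set.seed(1)
df$y_noisy <- df$y + rnorm(nrow(df), sd = 1e-3)

# The formula references column names (not df$...) since data = df
fo2 <- y_noisy ~ K / (1 + ((K - y0) / y0) * exp(-r * x))
model2 <- nls(fo2, data = df, start = list(K = 0.7, y0 = 0.01, r = 0.3))
summary(model2)

# Recent R versions also document nls.control(scaleOffset = 1)
# as a remedy for exactly these zero-residual problems.
```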

Non-linear regression: why isn't the model learning?

为君一笑 submitted on 2019-12-04 01:40:42

Question: I just started learning Keras. I am trying to train a non-linear regression model in Keras, but the model doesn't seem to learn much.

```python
import numpy as np
from keras.models import Sequential
from keras.layers import Dense
from keras.optimizers import SGD

# datapoints
X = np.arange(0.0, 5.0, 0.1, dtype='float32').reshape(-1, 1)
y = 5 * np.power(X, 2) + np.power(np.random.randn(50).reshape(-1, 1), 3)

# model
model = Sequential()
model.add(Dense(50, activation='relu', input_dim=1))
model.add(Dense(30, activation='relu', init='uniform'))
model.add(Dense(output_dim=1, activation='linear'))

# training
sgd = SGD(lr=0.1)
model …
```
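A minimal sketch of common fixes (an assumption about the failure mode: with targets up to ~125, SGD at lr=0.1 tends to diverge or stall, and scaling the target plus a gentler optimizer usually lets the network fit y ≈ 5x²):

```python
import numpy as np
from tensorflow.keras.layers import Dense
from tensorflow.keras.models import Sequential
from tensorflow.keras.optimizers import Adam

X = np.arange(0.0, 5.0, 0.1, dtype="float32").reshape(-1, 1)
y = 5 * X**2 + np.random.randn(50, 1).astype("float32") ** 3
y_scale = float(y.max())          # bring targets into roughly [0, 1]

model = Sequential([
    Dense(50, activation="relu", input_shape=(1,)),
    Dense(30, activation="relu"),
    Dense(1),                     # linear output for regression
])
model.compile(optimizer=Adam(learning_rate=0.01), loss="mse")
model.fit(X, y / y_scale, epochs=500, verbose=0)

pred = model.predict(X) * y_scale  # undo the target scaling
```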