regression

NaNs produced in negative binomial regression when using dnbinom()

Question: I am using dnbinom() to write the log-likelihood function and then estimating the parameters with mle2() from the bbmle package in R. The problem is that I get 16 warnings for my negative binomial model, all of them "NaNs produced", like this one:

1: In dnbinom(y, mu = mu, size = k, log = TRUE) : NaNs produced

My code:

# data
x <- c(0.35,0.45,0.90,0.05,1.00,0.50,0.45,0.25,0.15,0.40,0.26,0.37,0.43
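A common cause of these NaNs is the optimizer proposing non-positive values for mu or size during the search. A minimal sketch under that assumption (not the poster's full model, whose data is truncated above): parameterize both on the log scale so dnbinom() never sees invalid values. If the response itself contains non-integer values, dnbinom() warns about that separately.

library(bbmle)

set.seed(1)
y <- rpois(50, lambda = 3)  # placeholder counts standing in for the truncated data

# negative log-likelihood with mu and k optimized on the log scale
nll <- function(logmu, logk) {
  -sum(dnbinom(y, mu = exp(logmu), size = exp(logk), log = TRUE))
}

fit <- mle2(nll, start = list(logmu = log(mean(y)), logk = 0))
exp(coef(fit))  # back-transform to mu and k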

Handling many binary independent variables in lm

Question: When building a linear regression model using lm, the data set has about 20 independent variables. Do I need to explicitly declare them as factors? If I have to, how can I do that? Declaring them one by one would be very tedious.

Answer 1: First, check which variables R has automatically converted into factors with the command str(mydata). Then, if you want to convert several variables into factors easily, you can do something like this: create a "mycol" variable with the numbers of the columns you want to
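A hedged sketch of the conversion step the answer describes (the column positions in mycols are hypothetical):

str(mydata)            # see which columns are already factors
mycols <- c(2:10, 12)  # hypothetical positions of the binary predictors
mydata[mycols] <- lapply(mydata[mycols], factor)
str(mydata)            # verify the conversion

Note that for 0/1 dummies the fitted model is the same either way, since a two-level factor under treatment contrasts reproduces the numeric coding; the conversion mainly matters for variables with more than two levels or non-numeric codes.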

R: Clustering standard errors in MASS::polr()

Question: I am trying to estimate an ordinal logistic regression with clustered standard errors using the MASS package's polr() function. There is no built-in clustering feature, so I am looking for (a) packages or (b) manual methods for calculating clustered standard errors from the model output. I plan to use the margins package to estimate marginal effects from the model. Here is an example:

library(MASS)
set.seed(1)
obs <- 500

# Create data frame
dat <- data.frame(y = as.factor(round(rnorm(n = obs,
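One route, sketched under the assumption that your version of the sandwich package provides estfun/bread methods for polr objects (recent versions do): compute a cluster-robust covariance matrix with vcovCL() and take the square roots of its diagonal. The cluster variable here is hypothetical.

library(MASS)
library(sandwich)

set.seed(1)
n <- 500
dat <- data.frame(x  = rnorm(n),
                  id = sample(1:50, n, replace = TRUE))  # hypothetical cluster variable
dat$y <- cut(dat$x + rnorm(n), 3,
             labels = c("low", "mid", "high"), ordered_result = TRUE)

m  <- polr(y ~ x, data = dat, Hess = TRUE)
vc <- vcovCL(m, cluster = ~ id)            # cluster-robust covariance matrix
cbind(estimate   = c(coef(m), m$zeta),     # slopes then cutpoints, matching vcov order
      cluster_se = sqrt(diag(vc)))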

R: How (or should I) drop an insignificant orthogonal polynomial basis in a linear model?

Question: I have soil moisture data with x-, y- and z-coordinates like this:

gue <- structure(list(x = c(311939.1507, 311935.4607, 311924.7316, 311959.553, 311973.5368, 311953.3743, 311957.9409, 311948.3151, 311946.7169, 311997.0803, 312017.5236, 312006.0245, 312001.5179, 311992.7044, 311977.3076, 311960.4159, 311970.6047, 311957.2564, 311866.4246, 311870.8714, 311861.4461, 311928.7096, 311929.6291, 311929.4233, 311891.2915, 311890.3429, 311900.8905, 311864.4995, 311870.8143, 311866.9257, 312002.571,
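The soil-moisture data above is truncated, so here is a toy sketch of the mechanics only: poly() builds mutually orthogonal basis columns, so an individual column can be dropped without changing the remaining coefficient estimates, and the nested fits can be compared with anova().

set.seed(1)
x <- runif(100)
y <- 1 + 2 * x + rnorm(100)

X <- poly(x, degree = 3)             # orthogonal basis, columns "1", "2", "3"
fit_full    <- lm(y ~ X)             # all three basis columns
fit_reduced <- lm(y ~ X[, c(1, 3)])  # drop only the (insignificant) quadratic column
anova(fit_reduced, fit_full)         # nested-model comparison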

Plotting linear regression with Date/Week on the x-axis using Seaborn

Question: My company uses an unusual date notation with this format: [2-digit week number][2-digit working hour]. Both parts use leading zeros, so the data could look like: 0801, 0802, 0901, 0902, 0903, 1001, 1002, 1003. For each of these "dates" there is a score, which is just a regular floating-point number from 0 to 100. Example (csv):

wxxhxx,scoring
0101,5.3
0102,6.6
0103,6.2

Here is some sample data. With this data I want to create a scatter plot including a linear regression! I was able

Plot regression line on a scatter plot from regression coefficients

Question: I am trying to draw regression lines using the following: https://observablehq.com/@harrystevens/introducing-d3-regression#linear. I followed the tutorial and added the following code:

dataLinear = [{x: 8, y: 3},{x: 2, y: 10},{x: 11, y: 3},{x: 6, y: 6},{x: 5, y: 8},{x: 4, y: 12},{x: 12, y: 1},{x: 9, y: 4},{x: 6, y: 9},{x: 1, y: 14}]

linearRegression = d3.regressionLinear()
  .x(d => d.x)
  .y(d => d.y)
  .domain([-1.7, 16]);

res = linearRegression(dataLinear)
console.log(res)

Now, I get back the

How to get measures of model fit (AIC, F-statistics) in zelig for multiply imputed data?

Question: Following up on an earlier post, I am interested in learning how to get the usual measures of the relative quality of a statistical model in Zelig for regression using multiply imputed data (created with Amelia).

require(Zelig)
require(Amelia)
data(freetrade)

# Imputation of missing data
a.out <- amelia(freetrade, m=5, ts="year", cs="country")

# Regression model
z.out <- zelig(polity~tariff+gdp.pc, model="ls", data=a.out$imputations)
summary(z.out)

Model: ls
Number of multiply imputed data
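Zelig's pooled summary does not report AIC or F-statistics itself. One rough workaround, not a principled pooling rule: fit the same model to each completed dataset with lm() and inspect the per-imputation statistics and their spread.

library(Amelia)

data(freetrade)
a.out <- amelia(freetrade, m = 5, ts = "year", cs = "country")

# one lm fit per completed dataset
fits <- lapply(a.out$imputations, function(d) lm(polity ~ tariff + gdp.pc, data = d))

sapply(fits, AIC)        # AIC for each imputed dataset
mean(sapply(fits, AIC))  # crude average across imputations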

Unused arguments error when using apply() in R

Question: I get an error message when I attempt to use apply() conditional on a column of dates to return a set of coefficients. I have a dataset (herein modified for simplicity, but reproducible):

ADataset <- data.table(Epoch = c("2007-11-15", "2007-11-16", "2007-11-17", "2007-11-18", "2007-11-19", "2007-11-20", "2007-11-21"),
                       Distance = c("92336.22", "92336.23", "92336.22", "92336.20", "92336.19", "92336.21", "92336.18"))
ADataset
        Epoch Distance
1: 2007-11-15 92336.22
2: 2007-11-16 92336.23
3: 2007-11
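The failing apply() call itself is truncated above, so this is only a guess at the usual cause: apply() flattens each row of a mixed-type table into a character vector, so downstream functions receive arguments of the wrong type or name. Working on properly typed columns avoids the problem; a minimal sketch with the sample data:

library(data.table)

ADataset <- data.table(
  Epoch    = as.Date(c("2007-11-15", "2007-11-16", "2007-11-17", "2007-11-18",
                       "2007-11-19", "2007-11-20", "2007-11-21")),
  Distance = c(92336.22, 92336.23, 92336.22, 92336.20,
               92336.19, 92336.21, 92336.18)  # numeric, not character
)

# e.g. regression coefficients of Distance on time, with no row-wise apply()
coef(lm(Distance ~ as.numeric(Epoch), data = ADataset))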

How to output several variables in the same row using stargazer in R

Question: I would like to output the interaction terms from several regressions in the same row and call it "Interaction". So far, the interaction terms show up in two different rows, each called "Interaction" (see code below). This question has already been asked here, but my score isn't high enough yet to upvote it or comment on it: https://stackoverflow.com/questions/28859569/several-coefficients-in-one-line.

library("stargazer")
stargazer(attitude)
stargazer(attitude, summary=FALSE) #
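One common workaround (a sketch, not a documented stargazer feature): rename each model's interaction coefficient to a shared name inside the fitted objects, so stargazer aligns both estimates in a single row. The two models here are illustrative.

library(stargazer)

m1 <- lm(rating ~ complaints * critical, data = attitude)
m2 <- lm(rating ~ learning * critical, data = attitude)

# give both interaction coefficients the same name so stargazer merges the rows
names(m1$coefficients)[names(m1$coefficients) == "complaints:critical"] <- "Interaction"
names(m2$coefficients)[names(m2$coefficients) == "learning:critical"] <- "Interaction"

stargazer(m1, m2, type = "text")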

rstudent() returns incorrect result for an “mlm” (linear models fitted with multiple LHS)

Question: I know that support for linear models with multiple LHS is limited. But when it is possible to run a function on an "mlm" object, I would expect the results to be trustworthy. When using rstudent, strange results are produced. Is this a bug, or is there some other explanation? In the example below, fittedA and fittedB are identical, but in the case of rstudent the second column differs.

y <- matrix(rnorm(20), 10, 2)
x <- 1:10
fittedA <- fitted(lm(y ~ x))
fittedB <- cbind(fitted(lm(y[, 1] ~ x)),
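A minimal check mirroring the question, plus the usual workaround: compute rstudent() column by column from single-response fits, which can be trusted, and compare against the "mlm" result.

set.seed(1)
y <- matrix(rnorm(20), 10, 2)
x <- 1:10

rstudentA <- rstudent(lm(y ~ x))                 # "mlm" fit
rstudentB <- cbind(rstudent(lm(y[, 1] ~ x)),
                   rstudent(lm(y[, 2] ~ x)))     # single-LHS fits, the safe route
all.equal(unname(rstudentA), unname(rstudentB))  # reportedly FALSE in column 2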