regression

NaNs produced in negative binomial regression when using dnbinom()

Question: I am using dnbinom() to write the log-likelihood function and then estimating the parameters with mle2() from the bbmle package in R. The problem is that I get 16 warnings for my negative binomial model, all of them "NaNs produced", like this one:

1: In dnbinom(y, mu = mu, size = k, log = TRUE) : NaNs produced

My code:

# data
x <- c(0.35,0.45,0.90,0.05,1.00,0.50,0.45,0.25,0.15,0.40,0.26,0.37,0.43
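A common cause of these NaNs is the optimizer proposing non-positive values for mu or size during the search. A minimal sketch under that assumption (not the poster's full model, whose data is truncated above): parameterize both on the log scale so dnbinom() never sees invalid values. If the response itself contains non-integer values, dnbinom() warns about that separately.

library(bbmle)

set.seed(1)
y <- rpois(50, lambda = 3)  # placeholder counts standing in for the truncated data

# negative log-likelihood with mu and k optimized on the log scale
nll <- function(logmu, logk) {
  -sum(dnbinom(y, mu = exp(logmu), size = exp(logk), log = TRUE))
}

fit <- mle2(nll, start = list(logmu = log(mean(y)), logk = 0))
exp(coef(fit))  # back-transform to mu and k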

Handling many binary independent variables in lm

Question: When building a linear regression model using lm, the data set has about 20 independent variables. Do I need to explicitly declare them as factors? If I have to, how can I do that? Declaring them one by one would be very tedious.

Answer 1: First, check which variables R has automatically converted into factors with the command str(mydata). Then, if you want to convert several variables into factors easily, you can do something like this: create a "mycol" variable with the numbers of the columns you want to
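A hedged sketch of the conversion step the answer describes (the column positions in mycols are hypothetical):

str(mydata)            # see which columns are already factors
mycols <- c(2:10, 12)  # hypothetical positions of the binary predictors
mydata[mycols] <- lapply(mydata[mycols], factor)
str(mydata)            # verify the conversion

Note that for 0/1 dummies the fitted model is the same either way, since a two-level factor under treatment contrasts reproduces the numeric coding; the conversion mainly matters for variables with more than two levels or non-numeric codes.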

R: Clustering standard errors in MASS::polr()

Question: I am trying to estimate an ordinal logistic regression with clustered standard errors using the MASS package's polr() function. There is no built-in clustering feature, so I am looking for (a) packages or (b) manual methods for calculating clustered standard errors from the model output. I plan to use the margins package to estimate marginal effects from the model. Here is an example:

library(MASS)
set.seed(1)
obs <- 500

# Create data frame
dat <- data.frame(y = as.factor(round(rnorm(n = obs,
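One route, sketched under the assumption that your version of the sandwich package provides estfun/bread methods for polr objects (recent versions do): compute a cluster-robust covariance matrix with vcovCL() and take the square roots of its diagonal. The cluster variable here is hypothetical.

library(MASS)
library(sandwich)

set.seed(1)
n <- 500
dat <- data.frame(x  = rnorm(n),
                  id = sample(1:50, n, replace = TRUE))  # hypothetical cluster variable
dat$y <- cut(dat$x + rnorm(n), 3,
             labels = c("low", "mid", "high"), ordered_result = TRUE)

m  <- polr(y ~ x, data = dat, Hess = TRUE)
vc <- vcovCL(m, cluster = ~ id)            # cluster-robust covariance matrix
cbind(estimate   = c(coef(m), m$zeta),     # slopes then cutpoints, matching vcov order
      cluster_se = sqrt(diag(vc)))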

R: How (or should I) drop an insignificant orthogonal polynomial basis in a linear model?

Question: I have soil moisture data with x-, y- and z-coordinates like this:

gue <- structure(list(x = c(311939.1507, 311935.4607, 311924.7316, 311959.553, 311973.5368, 311953.3743, 311957.9409, 311948.3151, 311946.7169, 311997.0803, 312017.5236, 312006.0245, 312001.5179, 311992.7044, 311977.3076, 311960.4159, 311970.6047, 311957.2564, 311866.4246, 311870.8714, 311861.4461, 311928.7096, 311929.6291, 311929.4233, 311891.2915, 311890.3429, 311900.8905, 311864.4995, 311870.8143, 311866.9257, 312002.571,
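The soil-moisture data above is truncated, so here is a toy sketch of the mechanics only: poly() builds mutually orthogonal basis columns, so an individual column can be dropped without changing the remaining coefficient estimates, and the nested fits can be compared with anova().

set.seed(1)
x <- runif(100)
y <- 1 + 2 * x + rnorm(100)

X <- poly(x, degree = 3)             # orthogonal basis, columns "1", "2", "3"
fit_full    <- lm(y ~ X)             # all three basis columns
fit_reduced <- lm(y ~ X[, c(1, 3)])  # drop only the (insignificant) quadratic column
anova(fit_reduced, fit_full)         # nested-model comparison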

Plotting linear regression with Date/Week on the x-axis using Seaborn

Question: My company uses an unusual date notation with this format: [2-digit week number][2-digit working hour]. Both parts use leading zeros, so the data could look like: 0801, 0802, 0901, 0902, 0903, 1001, 1002, 1003. For each of these "dates" there is a score, which is just a regular floating-point number from 0 to 100. Example (csv):

wxxhxx,scoring
0101,5.3
0102,6.6
0103,6.2

Here is some sample data. With this data I want to create a scatter plot including a linear regression! I was able

Plot regression line on a scatter plot from regression coefficients

Question: I am trying to draw regression lines using the following: https://observablehq.com/@harrystevens/introducing-d3-regression#linear. I followed the tutorial and added the following code:

dataLinear = [{x: 8, y: 3},{x: 2, y: 10},{x: 11, y: 3},{x: 6, y: 6},{x: 5, y: 8},{x: 4, y: 12},{x: 12, y: 1},{x: 9, y: 4},{x: 6, y: 9},{x: 1, y: 14}]

linearRegression = d3.regressionLinear()
  .x(d => d.x)
  .y(d => d.y)
  .domain([-1.7, 16]);

res = linearRegression(dataLinear)
console.log(res)

Now, I get back the

How to get measures of model fit (AIC, F-statistics) in zelig for multiply imputed data?

Question: Following up on an earlier post, I am interested in learning how to get the usual measures of the relative quality of a statistical model in Zelig for regression using multiply imputed data (created with Amelia).

require(Zelig)
require(Amelia)
data(freetrade)

# Imputation of missing data
a.out <- amelia(freetrade, m=5, ts="year", cs="country")

# Regression model
z.out <- zelig(polity~tariff+gdp.pc, model="ls", data=a.out$imputations)
summary(z.out)

Model: ls
Number of multiply imputed data
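Zelig's pooled summary does not report AIC or F-statistics itself. One rough workaround, not a principled pooling rule: fit the same model to each completed dataset with lm() and inspect the per-imputation statistics and their spread.

library(Amelia)

data(freetrade)
a.out <- amelia(freetrade, m = 5, ts = "year", cs = "country")

# one lm fit per completed dataset
fits <- lapply(a.out$imputations, function(d) lm(polity ~ tariff + gdp.pc, data = d))

sapply(fits, AIC)        # AIC for each imputed dataset
mean(sapply(fits, AIC))  # crude average across imputations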

Unused arguments error when using apply() in R

Question: I get an error message when I attempt to use apply() conditional on a column of dates to return a set of coefficients. I have a dataset (herein modified for simplicity, but reproducible):

ADataset <- data.table(Epoch = c("2007-11-15", "2007-11-16", "2007-11-17", "2007-11-18", "2007-11-19", "2007-11-20", "2007-11-21"),
                       Distance = c("92336.22", "92336.23", "92336.22", "92336.20", "92336.19", "92336.21", "92336.18"))
ADataset
        Epoch Distance
1: 2007-11-15 92336.22
2: 2007-11-16 92336.23
3: 2007-11
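The failing apply() call itself is truncated above, so this is only a guess at the usual cause: apply() flattens each row of a mixed-type table into a character vector, so downstream functions receive arguments of the wrong type or name. Working on properly typed columns avoids the problem; a minimal sketch with the sample data:

library(data.table)

ADataset <- data.table(
  Epoch    = as.Date(c("2007-11-15", "2007-11-16", "2007-11-17", "2007-11-18",
                       "2007-11-19", "2007-11-20", "2007-11-21")),
  Distance = c(92336.22, 92336.23, 92336.22, 92336.20,
               92336.19, 92336.21, 92336.18)  # numeric, not character
)

# e.g. regression coefficients of Distance on time, with no row-wise apply()
coef(lm(Distance ~ as.numeric(Epoch), data = ADataset))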

How to output several variables in the same row using stargazer in R

Question: I would like to output the interaction terms from several regressions in the same row and call it "Interaction". So far, the interaction terms show up in two different rows, each called "Interaction" (see code below). This question has already been asked here, but my score isn't high enough yet to upvote it or comment on it: https://stackoverflow.com/questions/28859569/several-coefficients-in-one-line.

library("stargazer")
stargazer(attitude)
stargazer(attitude, summary=FALSE) #
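One common workaround (a sketch, not a documented stargazer feature): rename each model's interaction coefficient to a shared name inside the fitted objects, so stargazer aligns both estimates in a single row. The two models here are illustrative.

library(stargazer)

m1 <- lm(rating ~ complaints * critical, data = attitude)
m2 <- lm(rating ~ learning * critical, data = attitude)

# give both interaction coefficients the same name so stargazer merges the rows
names(m1$coefficients)[names(m1$coefficients) == "complaints:critical"] <- "Interaction"
names(m2$coefficients)[names(m2$coefficients) == "learning:critical"] <- "Interaction"

stargazer(m1, m2, type = "text")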

rstudent() returns incorrect result for an “mlm” (linear models fitted with multiple LHS)

Question: I know that support for linear models with multiple LHS is limited. But when it is possible to run a function on an "mlm" object, I would expect the results to be trustworthy. When using rstudent, strange results are produced. Is this a bug, or is there some other explanation? In the example below, fittedA and fittedB are identical, but in the case of rstudent the second column differs.

y <- matrix(rnorm(20), 10, 2)
x <- 1:10
fittedA <- fitted(lm(y ~ x))
fittedB <- cbind(fitted(lm(y[, 1] ~ x)),
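A minimal check mirroring the question, plus the usual workaround: compute rstudent() column by column from single-response fits, which can be trusted, and compare against the "mlm" result.

set.seed(1)
y <- matrix(rnorm(20), 10, 2)
x <- 1:10

rstudentA <- rstudent(lm(y ~ x))                 # "mlm" fit
rstudentB <- cbind(rstudent(lm(y[, 1] ~ x)),
                   rstudent(lm(y[, 2] ~ x)))     # single-LHS fits, the safe route
all.equal(unname(rstudentA), unname(rstudentB))  # reportedly FALSE in column 2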