regression

Regression with equality and inequality constrained coefficients in R

Submitted by 可紊 on 2019-12-02 04:12:31
I am trying to obtain estimated constrained coefficients using RSS. The beta coefficients are constrained to [0,1] and sum to 1. Additionally, my third parameter is constrained to (-1,1). Using the code below I can obtain a nice solution with simulated variables, but when I apply the method to my real data set I keep arriving at a non-unique solution. I'm therefore wondering whether there is a more numerically stable way to obtain my estimated parameters.

set.seed(234)
k = 2
a = diff(c(0, sort(runif(k-1)), 1))
n = 1e4
x = matrix(rnorm(k*n), nc = k)
a2 = -0.5
y = a2 * (x %*% a) +
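A minimal sketch of one more numerically stable route (my own suggestion, not from the post): reparameterize so the constraints hold by construction, with a softmax for the betas and tanh for the third parameter. The noise term on y is assumed, since the post is truncated at that point; all other names follow the question's code.

set.seed(234)
k  <- 2
a  <- diff(c(0, sort(runif(k - 1)), 1))
n  <- 1e4
x  <- matrix(rnorm(k * n), nc = k)
a2 <- -0.5
y  <- a2 * (x %*% a) + rnorm(n, sd = 0.1)      # assumed noise; the original post truncates here

rss <- function(par) {
  beta  <- exp(par[1:k]) / sum(exp(par[1:k]))  # softmax: each in [0,1], sums to 1
  a2hat <- tanh(par[k + 1])                    # maps to (-1,1)
  sum((y - a2hat * (x %*% beta))^2)
}
fit <- optim(rep(0, k + 1), rss, method = "BFGS")
exp(fit$par[1:k]) / sum(exp(fit$par[1:k]))     # estimated betas
tanh(fit$par[k + 1])                           # estimated a2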

Using Scipy curve_fit with variable number of parameters to optimize

Submitted by 血红的双手。 on 2019-12-02 03:54:08
Assuming we have the function below to optimize over 4 parameters, we have to write it as shown, but if we want the same function with more parameters, we have to rewrite the function definition.

def radius(z, a0, a1, k0, k1):
    k = np.array([k0, k1])
    a = np.array([a0, a1])
    w = 1.0
    phi = 0.0
    rs = r0 + np.sum(a*np.sin(k*z + w*t + phi), axis=1)
    return rs

The question is whether this can be done more easily and automatically, and more intuitively, than this suggests. An example that would otherwise have to be written out by hand:

def radius(z, a0, a1, a2, a3, a4, a5, a6, a7, a8, a9, k0, k1, k2
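One common workaround (a sketch of my own, not the poster's code) is to let the model accept a flat parameter vector via *args and split it into the a's and k's; curve_fit then infers the number of parameters from the length of p0. The constants r0, w, t, and phi are assumed here, since the post leaves some of them undefined.

import numpy as np
from scipy.optimize import curve_fit

r0, w, t, phi = 1.0, 1.0, 0.0, 0.0          # assumed constants

def radius(z, *params):
    half = len(params) // 2
    a = np.array(params[:half])             # first half: amplitudes
    k = np.array(params[half:])             # second half: wave numbers
    return r0 + np.sum(a * np.sin(np.outer(z, k) + w * t + phi), axis=1)

z = np.linspace(0, 10, 200)
y = radius(z, 0.5, 0.3, 1.0, 2.0) + 0.01 * np.random.randn(z.size)
p0 = np.ones(4)                             # len(p0) fixes the parameter count
popt, pcov = curve_fit(radius, z, y, p0=p0)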

Change reference group using glm with binomial family

Submitted by £可爱£侵袭症+ on 2019-12-02 03:41:08
Question: When I run a binomial regression in R with an independent factor variable consisting of three levels, "Higher", "Middle" and "Lower", and try to change the reference category using relevel, I get this error: "Error in relevel.ordered(cbsnivcat3, "Lower") : 'relevel' only for factors". I have checked whether cbsnivcat3 is a factor:

> is.factor(data$cbsnivcat3)
[1] TRUE
> levels(data$cbsnivcat3)
[1] "Higher" "Middle" "Lower"
> t1m4=glm(tertiary ~ relevel(cbsnivcat3, "Lower") , family =
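The error name relevel.ordered suggests cbsnivcat3 is an ordered factor, which relevel() refuses to handle. A minimal sketch of the usual fix, assuming that diagnosis (the binomial family is completed from the question's title):

data$cbsnivcat3 <- factor(data$cbsnivcat3, ordered = FALSE)   # drop the ordering, keep the levels
data$cbsnivcat3 <- relevel(data$cbsnivcat3, ref = "Lower")    # now relevel works
t1m4 <- glm(tertiary ~ cbsnivcat3, family = binomial, data = data)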

Incorrect abline line for a regression model with intercept in R

Submitted by 无人久伴 on 2019-12-02 03:16:44
(Reproducible example given.) In the following, I get an abline whose y-intercept looks like about 30, but the regression says the y-intercept should be 37.2851. Where am I wrong?

mtcars$mpg  # 21.0 21.0 22.8 ... 21.4 (32 obs)
mtcars$wt   # 2.620 2.875 2.320 ... 2.780 (32 obs)
regression1 <- lm(mtcars$mpg ~ mtcars$wt)
coef(regression1)  # mpg ~ 37.2851 - 5.3445*wt
plot(mtcars$mpg ~ mtcars$wt, pch=19, col='gray50')   # pch: shape of points
abline(h=mean(mtcars$mpg), lwd=2, col='darkorange')  # y-coordinate of horizontal line: 20.09062
abline(lm(mtcars$mpg ~ mtcars$wt), lwd=2, col='sienna')

I looked at all the
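A short check of my own: the default plot's x-axis starts near min(mtcars$wt), about 1.51, not at wt = 0, so the fitted line meets the left edge of the plotting region at about 29, even though the true intercept at wt = 0 is 37.2851.

coef(lm(mpg ~ wt, data = mtcars))   # (Intercept) 37.2851, wt -5.3445
37.2851 - 5.3445 * min(mtcars$wt)   # about 29.2: the line's height at the plot's left edge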

How do I make a regression tree like this?

Submitted by 北战南征 on 2019-12-02 02:01:50
I would like to make a regression tree like the one in the picture. The tree was done in Cubist, but I don't have that program. I do use R and Python. It seems to differ from the R packages rpart or tree in that the end nodes are linear formulas rather than just the average value. Is there any way I can do this using R or some other free software? In the picture, NDVI, B1, B2, etc. are variables. The image is from this website.

Answer: Cubist is an R port of the Cubist GPL C code released by RuleQuest at http://rulequest.com/cubist-info.html . Using the example from help('cubist') and the original
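A minimal sketch of the direction the answer points to, using the Cubist package's help example data rather than the poster's NDVI/B1/B2 set:

library(Cubist)
data(BostonHousing, package = "mlbench")   # example data from help('cubist')
fit <- cubist(x = BostonHousing[, -14], y = BostonHousing$medv)
summary(fit)   # prints the rules; each terminal rule ends in a linear formula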

How to specify covariates in a regression model

Submitted by 北慕城南 on 2019-12-02 01:49:19
The dataset I would like to analyse looks like this:

n <- 4000
tmp <- t(replicate(n, sample(49, 6)))
dat <- matrix(0, nrow=n, ncol=49)
colnames(dat) <- paste("p", 1:49, sep="")
dat <- as.data.frame(dat)
dat[, "win.frac"] <- rnorm(n, mean=0.0176504, sd=0.002)
for (i in 1:nrow(dat)) for (j in 1:6) dat[i, paste("p", tmp[i, j], sep="")] <- 1
str(dat)

Now I would like to perform a regression with dependent variable win.frac and all other variables (p1, ..., p49) as explanatory variables. However, with all approaches I have tried, I get the coefficient for p49 as NA, with the message "1 not defined
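The NA arises because exactly six of the 49 indicators are 1 in every row, so their sum is constant and, together with the intercept, one column is linearly dependent. A minimal sketch of one standard remedy (my own suggestion): drop the intercept.

fit <- lm(win.frac ~ . - 1, data = dat)   # "." = all other columns, "- 1" = no intercept
sum(is.na(coef(fit)))                     # 0: all 49 coefficients are now estimated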

Run regression analysis on multiple subsets of pandas columns efficiently

Submitted by 坚强是说给别人听的谎言 on 2019-12-02 01:45:44
Question: I could have chosen to ask a shorter question that focuses only on the core problem here, which is list permutations. But the reason I'm bringing statsmodels and pandas into the question is that there may be specific tools for step-wise regression that have the flexibility to store the desired regression output as I'm about to show below, but that are much more efficient. At least I hope so. Given a dataframe like below:

Code snippet 1:

# Imports
import pandas as
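A sketch of the general pattern, with made-up column names (y, x1..x3): enumerate column subsets with itertools.combinations and store one fitted result per subset.

import itertools
import numpy as np
import pandas as pd
import statsmodels.api as sm

np.random.seed(0)
df = pd.DataFrame(np.random.randn(100, 4), columns=["y", "x1", "x2", "x3"])

results = {}
for r in range(1, 4):
    for subset in itertools.combinations(["x1", "x2", "x3"], r):
        X = sm.add_constant(df[list(subset)])
        results[subset] = sm.OLS(df["y"], X).fit()   # keep the fit, or just fit.rsquared etc.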

Interpreting interactions in a regression model

Submitted by 杀马特。学长 韩版系。学妹 on 2019-12-02 01:35:31
A simple question, I hope. I have an experimental design where I measure some response (let's say blood pressure) in two groups: a control group and an affected group, where both are given three treatments: t1, t2, t3. The data are not paired in any sense. Here are some example data:

set.seed(1)
df <- data.frame(response = c(rnorm(5,10,1), rnorm(5,10,1), rnorm(5,10,1),
                              rnorm(5,7,1), rnorm(5,5,1), rnorm(5,10,1)),
                 group = as.factor(c(rep("control",15), rep("affected",15))),
                 treatment = as.factor(rep(c(rep("t1",5), rep("t2",5), rep("t3",5)), 2)))

What I am interested in is quantifying the effect that each
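A minimal sketch of the standard approach: fit the group-by-treatment interaction, then read treatment effects within each group, here via the emmeans package (my choice; the post does not name it).

fit <- lm(response ~ group * treatment, data = df)
summary(fit)                                  # interaction terms: how treatment effects differ by group

library(emmeans)                              # assumed installed
emmeans(fit, pairwise ~ treatment | group)    # treatment contrasts within each group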