regression

Fit a non-linear function to data/observations with pyMCMC/pyMC

拥有回忆 · submitted on 2019-11-26 17:59:39
Question: I am trying to fit some data with a Gaussian (and more complex) function(s). I have created a small example below. My first question is: am I doing it right? My second question is: how do I add an error in the x-direction, i.e. in the x-position of the observations/data? It is very hard to find good guides on how to do this kind of regression in pyMC, perhaps because it's easier to use a least-squares or similar approach; however, I have many parameters in the end and need to see how well …
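Since the pyMC snippet above is truncated, here is a minimal, library-free sanity check for the same task (not the pyMC model itself; all names below are illustrative): for (near-)noiseless data, a Gaussian y = A·exp(−(x−μ)²/(2σ²)) becomes a quadratic in x after taking logs, so numpy's `polyfit` recovers A, μ, σ in closed form, giving a reference fit to compare a Bayesian result against.

```python
import numpy as np

# True parameters of the Gaussian we will try to recover (made up for the sketch)
A_true, mu_true, sig_true = 2.0, 1.5, 0.7

x = np.linspace(-2.0, 5.0, 50)
y = A_true * np.exp(-(x - mu_true) ** 2 / (2.0 * sig_true ** 2))  # noiseless curve

# log(y) is exactly quadratic in x: log y = c2*x^2 + c1*x + c0
c2, c1, c0 = np.polyfit(x, np.log(y), 2)

sig2 = -1.0 / (2.0 * c2)                 # sigma^2 = -1/(2*c2)
mu_hat = c1 * sig2                       # mu = c1 * sigma^2
sig_hat = float(np.sqrt(sig2))
A_hat = float(np.exp(c0 + mu_hat ** 2 / (2.0 * sig2)))
```

With noisy data this log-transform trick biases the fit, which is exactly when a proper likelihood-based approach such as pyMC becomes worthwhile.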

Run an OLS regression with Pandas Data Frame

有些话、适合烂在心里 · submitted on 2019-11-26 17:53:39
Question: I have a pandas data frame and I would like to be able to predict the values of column A from the values in columns B and C. Here is a toy example: import pandas as pd df = pd.DataFrame({"A": [10,20,30,40,50], "B": [20, 30, 10, 40, 50], "C": [32, 234, 23, 23, 42523]}) Ideally, I would have something like ols(A ~ B + C, data = df), but when I look at the examples from algorithm libraries like scikit-learn, it appears to feed the data to the model as a list of rows instead of columns. This would …
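A hedged numpy-only sketch of the ols(A ~ B + C) idea: stack the columns into a design matrix with an intercept column and solve the least-squares problem directly. Column A is constructed here from B and C so the recovered coefficients are known; with a real data frame you would pass df["B"].values etc. instead.

```python
import numpy as np

# Columns mirroring the question's toy data frame
B = np.array([20, 30, 10, 40, 50], dtype=float)
C = np.array([32, 234, 23, 23, 42523], dtype=float)
# A is constructed from B and C here so the true coefficients are known
A = 1.0 + 2.0 * B + 0.5 * C

X = np.column_stack([np.ones_like(B), B, C])   # intercept + predictors
coef, *_ = np.linalg.lstsq(X, A, rcond=None)   # [intercept, b_B, b_C]
pred = X @ coef
```

In practice statsmodels' formula API (smf.ols("A ~ B + C", data=df)) gives exactly the R-like interface the question asks for; the sketch above just shows what it solves under the hood.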

How does `poly()` generate orthogonal polynomials? How to understand the “coefs” returned?

匆匆过客 · submitted on 2019-11-26 17:45:40
Question: My understanding of orthogonal polynomials is that they take the form y(x) = a1 + a2(x - c1) + a3(x - c2)(x - c3) + a4(x - c4)(x - c5)(x - c6) + … up to the number of terms desired, where a1, a2, etc. are coefficients of each orthogonal term (varying between fits), and c1, c2, etc. are coefficients within the orthogonal terms, determined such that the terms maintain orthogonality (consistent between fits using the same x values). I understand poly() is used to fit orthogonal polynomials. An example: x …
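R's poly() (roughly) centers x, builds the Vandermonde matrix of the centered values, and orthonormalizes its columns with a QR decomposition; the returned "coefs" (alpha and norm2) store the centering and scaling needed to rebuild the same basis for new x values at prediction time. A numpy sketch of the basis construction (column signs may differ from R's):

```python
import numpy as np

x = np.arange(1.0, 11.0)     # 10 x values, as in a typical R example
degree = 3

xc = x - x.mean()                                   # poly() centers x first
V = np.vander(xc, degree + 1, increasing=True)      # columns: 1, x, x^2, x^3
Q, _ = np.linalg.qr(V)                              # Gram-Schmidt via QR
basis = Q[:, 1:]                                    # drop the constant column
gram = basis.T @ basis                              # should be the identity
```

Because the columns are orthonormal and orthogonal to the constant, each column sums to zero and the Gram matrix is the identity, which is what makes the fitted coefficients independent of one another.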

`lm` summary does not display all factor levels

不问归期 · submitted on 2019-11-26 17:44:06
I am running a linear regression on a number of attributes, including two categorical attributes, B and F, and I don't get a coefficient value for every factor level I have. B has 9 levels and F has 6 levels. When I initially ran the model (with intercepts), I got 8 coefficients for B and 5 for F, which I understood as the first level of each being absorbed into the intercept. I want to rank the levels within B and F based on their coefficients, so I added -1 after each factor to lock the intercept at 0 so that I could get coefficients for all levels. Call: lm(formula = dependent ~ a + B-1 + c + d …
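The missing level is the standard dummy-variable trap: with an intercept, one indicator column per level makes the design matrix rank-deficient, so lm must drop a reference level; removing the intercept (-1) restores one coefficient per level. A small numpy illustration (the 3-level factor here is made up):

```python
import numpy as np

b = np.array([0, 0, 1, 1, 2, 2])                    # a made-up factor, 3 levels
levels = np.unique(b)
D_full = (b[:, None] == levels[None, :]).astype(float)  # one column per level
D_ref = D_full[:, 1:]                               # reference coding: drop level 0

# Intercept + all level columns -> rank-deficient design (the dummy trap)
X_trap = np.column_stack([np.ones(len(b)), D_full])
rank_trap = np.linalg.matrix_rank(X_trap)           # rank 3, not 4 columns
```

Note the caveat in the answers to this question: with -1 the coefficients become level means rather than contrasts, so ranking them answers a slightly different question than ranking contrasts against a baseline.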

Aligning Data frame with missing values

陌路散爱 · submitted on 2019-11-26 17:21:02
Question: I'm using a data frame with many NA values. While I'm able to create a linear model, I am subsequently unable to line the model's fitted values up with the original data, due to the missing values and the lack of an indicator column. Here's a reproducible example: library(MASS) dat <- Aids2 # Add NA's dat[floor(runif(100, min = 1, max = nrow(dat))),3] <- NA # Create a model model <- lm(death ~ diag + age, data = dat) # Different Values length(fitted.values(model)) # 2745 nrow(dat) # 2843 Answer 1: …
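The usual fix in R is na.action = na.exclude, which pads the fitted values back to the original length. The same bookkeeping can be sketched by hand: fit on the complete cases only, then scatter the fitted values into an NaN-filled vector of the original length (numpy sketch with invented toy data):

```python
import numpy as np

# Invented toy data with missing values in both variables
x = np.array([1.0, 2.0, np.nan, 4.0, 5.0])
y = np.array([2.0, 4.1, 6.0, 8.2, np.nan])

ok = ~(np.isnan(x) | np.isnan(y))        # complete cases only
slope, intercept = np.polyfit(x[ok], y[ok], 1)

fitted = np.full_like(x, np.nan)         # same length as the original data
fitted[ok] = intercept + slope * x[ok]   # fitted values back in their rows
```

The NaN entries now act as the "indicator column" the question is missing: the fitted vector aligns row-for-row with the original data.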

use stepAIC on a list of models

一世执手 · submitted on 2019-11-26 16:48:14
Question: I want to do stepwise regression using AIC on a list of linear models. The idea is to build a list of linear models and then apply stepAIC to each list element. It fails. I tried to track the problem down, and I think I found it; however, I don't understand the cause. Try the code to see the difference between the three cases. require(MASS) n<-30 x1<-rnorm(n, mean=0, sd=1) # create rv x1 x2<-rnorm(n, mean=1, sd=1) x3<-rnorm(n, mean=2, sd=1) epsilon<-rnorm(n,mean=0,sd=1) # random error …
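Independent of the scoping problem stepAIC hits here (it re-evaluates the model formula in the wrong environment when models are built inside a list), the quantity it ranks models by is just AIC = n·log(RSS/n) + 2k up to an additive constant. A numpy sketch comparing three nested models of the kind built above; the 2k penalty is what keeps the spurious x2 term from winning automatically:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 30
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 1.0 + 2.0 * x1 + rng.normal(scale=0.5, size=n)   # x2 is irrelevant

def ols_aic(X, y):
    """AIC of a Gaussian OLS fit, up to an additive constant."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = float(np.sum((y - X @ beta) ** 2))
    k = X.shape[1] + 1                    # coefficients + error variance
    return len(y) * np.log(rss / len(y)) + 2 * k

ones = np.ones(n)
models = [
    np.column_stack([ones]),              # intercept only
    np.column_stack([ones, x1]),          # the true model
    np.column_stack([ones, x1, x2]),      # one spurious extra term
]
aics = [ols_aic(X, y) for X in models]    # lower AIC = preferred model
```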

Fast pairwise simple linear regression between variables in a data frame

▼魔方 西西 · submitted on 2019-11-26 16:41:10
I have seen pairwise or general paired simple linear regression many times on Stack Overflow. Here is a toy dataset for this kind of problem.

set.seed(0)
X <- matrix(runif(100), 100, 5, dimnames = list(1:100, LETTERS[1:5]))
b <- c(1, 0.7, 1.3, 2.9, -2)
dat <- X * b[col(X)] + matrix(rnorm(100 * 5, 0, 0.1), 100, 5)
dat <- as.data.frame(dat)
pairs(dat)

So basically we want to compute 5 * 4 = 20 regression lines:

-----  A ~ B  A ~ C  A ~ D  A ~ E
B ~ A  -----  B ~ C  B ~ D  B ~ E
C ~ A  C ~ B  -----  C ~ D  C ~ E
D ~ A  D ~ B  D ~ C  -----  D ~ E
E ~ A  E ~ B  E ~ C  E ~ D  -----

Here is a poor man's strategy: poor < …
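The covariance trick behind the fast answers: the simple-regression slope of column i on column j is cov(i, j)/var(j), so one covariance matrix yields all 20 slopes at once, and the intercepts follow from the column means. A numpy sketch mirroring the toy data above:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(size=(100, 5))
b = np.array([1.0, 0.7, 1.3, 2.9, -2.0])
dat = X * b + rng.normal(scale=0.1, size=(100, 5))   # like the R toy data

S = np.cov(dat, rowvar=False)            # 5 x 5 covariance matrix
slopes = S / np.diag(S)                  # slopes[i, j] = slope of col_i ~ col_j
m = dat.mean(axis=0)
intercepts = m[:, None] - slopes * m[None, :]        # a = mean_i - slope * mean_j
```

One covariance computation replaces 20 separate lm() calls; the diagonal entries of `slopes` are all 1 and can simply be ignored.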

Fixed effect in Pandas or Statsmodels

核能气质少年 · submitted on 2019-11-26 15:58:25
Question: Is there an existing function to estimate fixed effects (one-way or two-way) in Pandas or Statsmodels? There used to be a function in Statsmodels, but it seems to have been discontinued. And in Pandas there is something called plm, but I can't import it or run it using pd.plm(). Answer 1: As noted in the comments, PanelOLS has been removed from Pandas as of version 0.20.0. So you really have three options: if you use Python 3, you can use linearmodels as specified in the more recent answer: https:/ …
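Whatever library is used, one-way fixed effects can always be estimated with the within transformation: demean y and x inside each entity, then run plain OLS, which wipes out the entity intercepts. A numpy sketch (group sizes and coefficients are invented):

```python
import numpy as np

rng = np.random.default_rng(1)
n_groups, n_per = 4, 25
g = np.repeat(np.arange(n_groups), n_per)            # entity identifier
alpha = rng.normal(scale=3.0, size=n_groups)         # entity fixed effects
x = rng.normal(size=n_groups * n_per)
y = alpha[g] + 2.0 * x + rng.normal(scale=0.5, size=n_groups * n_per)

def demean(v, g):
    """Subtract the within-group mean from each observation."""
    means = np.bincount(g, weights=v) / np.bincount(g)
    return v - means[g]

xd, yd = demean(x, g), demean(y, g)
beta_fe = float(np.sum(xd * yd) / np.sum(xd ** 2))   # within estimator
```

This is numerically the same slope you would get from OLS with a dummy for each entity, which is the fallback when linearmodels' PanelOLS is unavailable.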

Linear Regression with a known fixed intercept in R

北战南征 · submitted on 2019-11-26 15:20:37
Question: I want to calculate a linear regression using the lm() function in R. Additionally, I want to get the slope of a regression where I explicitly give the intercept to lm(). I found an example on the internet and I tried to read the R help page ?lm (unfortunately I'm not able to understand it), but I did not succeed. Can anyone tell me where my mistake is? lin <- data.frame(x = c(0:6), y = c(0.3, 0.1, 0.9, 3.1, 5, 4.9, 6.2)) plot(lin$x, lin$y) regImp = lm(formula = lin$x ~ lin$y) abline(regImp, …
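The standard trick for a known intercept c is to regress y − c on x without an intercept (in R: lm(I(y - c) ~ 0 + x)); note that the call in the question also has x and y swapped. In closed form the constrained slope is Σx(y − c)/Σx², sketched here on the question's data with an assumed c = 0.3:

```python
import numpy as np

x = np.arange(7.0)                                   # 0..6, as in the question
y = np.array([0.3, 0.1, 0.9, 3.1, 5.0, 4.9, 6.2])
c = 0.3                                              # assumed fixed intercept

slope = float(np.sum(x * (y - c)) / np.sum(x * x))   # OLS through the point (0, c)
fitted = c + slope * x
```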

How to force R to use a specified factor level as reference in a regression?

徘徊边缘 · submitted on 2019-11-26 14:56:32
How can I tell R to use a certain level as the reference when I use binary explanatory variables in a regression? By default it just uses some level: lm(x ~ y + as.factor(b)) with b ∈ {0, 1, 2, 3, 4}. Let's say I want to use 3 instead of the 0 that R picks. Answer: see the relevel() function. Here is an example: set.seed(123) x <- rnorm(100) DF <- data.frame(x = x, y = 4 + (1.5*x) + rnorm(100, sd = 2), b = gl(5, 20)) head(DF) str(DF) m1 <- lm(y ~ x + b, data = DF) summary(m1) Now alter the factor b in DF by use of the relevel() function: DF <- within(DF, b <- relevel(b, ref = 3)) m2 <- lm(y ~ x + …
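The same releveling can be expressed as a choice of dummy coding: put the desired reference level first and build indicator columns only for the remaining levels, so every coefficient is a contrast against that reference. A numpy sketch (level 3 as reference, as in the question):

```python
import numpy as np

b = np.array([0, 1, 2, 3, 4] * 4)        # a factor with levels 0..4
ref = 3                                  # the desired reference level

# "Relevel": put ref first, then build indicators for the other levels only
levels = [ref] + [int(l) for l in np.unique(b) if l != ref]
D = np.column_stack([(b == l).astype(float) for l in levels[1:]])
# In a regression with an intercept, each column's coefficient is the
# contrast of that level against level `ref`.
```

Rows at the reference level are all zeros, so their prediction comes entirely from the intercept, exactly what relevel(b, ref = 3) arranges in R.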