regression

How do you remove an insignificant factor level from a regression using the lm() function in R?

When I run a regression in R and declare a categorical predictor as a factor, it saves me from building the dummy variables myself. But how do I remove a factor level that is not significant from the regression, so that only significant variables are shown? For example:

dependent <- c(1:10)
independent1 <- as.factor(c('d','a','a','a','a','a','a','b','b','c'))
independent2 <- c(-0.71,0.30,1.32,0.30,2.78,0.85,-0.25,-1.08,-0.94,1.33)
output <- lm(dependent ~ independent1 + independent2)
summary(output)

Which results in the following regression output:

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   4.6180     1.0398
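A minimal sketch of one common workaround, assuming (purely as an illustration) that the level 'd' is the one whose dummy is not significant: collapse it into the reference level and refit, since lm() cannot drop a single level of a factor on its own.

# Merge level 'd' into the reference level 'a'; the refitted model
# then estimates one fewer dummy coefficient.
independent1_collapsed <- independent1
levels(independent1_collapsed)[levels(independent1_collapsed) == "d"] <- "a"

output2 <- lm(dependent ~ independent1_collapsed + independent2)
summary(output2)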

Subsetting in dredge (MuMIn) - must include interaction if main effects are present

I'm doing some exploratory work in which I use dredge{MuMIn}. There are two variables that should be allowed together ONLY when the interaction between them is present, i.e. they cannot appear together as main effects alone. Using sample data: I want to dredge the model fm1 (disregarding that it probably doesn't make sense). If the variables GNP and Population appear together, they must also include the interaction between them.

require(stats); require(graphics)
## give the data set in the form it is used in S-PLUS:
longley.x <- data.matrix(longley[, 1:6])
longley
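A sketch of one way to encode that rule, assuming fm1 is a model such as lm(Employed ~ GNP * Population, data = longley) and assuming dredge's subset argument accepts a logical expression over term names (with the interaction term back-quoted):

library(MuMIn)
options(na.action = "na.fail")   # dredge refuses to run with the default na.omit

fm1 <- lm(Employed ~ GNP * Population, data = longley)

# Drop every submodel in which GNP and Population are both present
# but the GNP:Population interaction is not.
ms <- dredge(fm1, subset = !(GNP && Population) || `GNP:Population`)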

Advice on calculating a function to describe upper bound of data

I have a scatter plot of a dataset and I am interested in calculating the upper bound of the data. I don't know whether there is a standard statistical approach for this, so what I was considering was splitting the X-axis into small ranges, calculating the maximum within each range, and then trying to identify a function that describes these points. Is there a function already in R to do this? If it's relevant, there are 92,611 points.

Answer: You might like to look into quantile regression, which is available in the quantreg package. Whether this is useful will depend on whether you want the absolute maximum within
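A minimal sketch of that suggestion, fitting a high conditional quantile (the 0.99 quantile here, an arbitrary choice) as an approximate upper envelope; x and y stand in for the 92,611 observed points:

library(quantreg)

fit_upper <- rq(y ~ x, tau = 0.99)   # 99th-percentile regression line

plot(x, y, pch = ".")
abline(fit_upper, col = "red")       # approximate upper bound of the cloud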

c# LOESS/LOWESS regression

Do you know of a .NET library to perform a LOESS/LOWESS regression? (preferably free/open source)

Answer (Nestor): a port from Java to C#:

public class LoessInterpolator {
    public static double DEFAULT_BANDWIDTH = 0.3;
    public static int DEFAULT_ROBUSTNESS_ITERS = 2;

    /**
     * The bandwidth parameter: when computing the loess fit at
     * a particular point, this fraction of source points closest
     * to the current point is taken into account for computing
     * a least-squares regression.
     *
     * A sensible value is usually 0.25 to 0.5.
     */
    private double bandwidth;

    /**
     * The number of robustness iterations parameter: this

Multiple Linear Regression function in SQL Server

I have developed a simple linear regression function in SQL Server, based on this post (https://ask.sqlservercentral.com/questions/96778/can-this-linear-regression-algorithm-for-sql-serve.html), to calculate alpha, beta and some extra values such as the upper and lower 95% limits. The simple linear regression takes the arguments X and y. Now I need to perform multiple linear regression in SQL Server, taking arguments y and X1, X2, X3, ..., Xn, so that the output looks as follows:

Coefficients  Standard Error  t Stat  P-value  Lower 95%  Upper 95%
+-------------------------------------------------------------------
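Not a SQL answer, but a minimal R sketch producing the same six columns, which may be useful as a reference to validate a hand-rolled SQL implementation against (the data frame d and columns y, X1, X2, X3 are assumed):

fit <- lm(y ~ X1 + X2 + X3, data = d)

coefs <- summary(fit)$coefficients   # Estimate, Std. Error, t value, Pr(>|t|)
ci    <- confint(fit, level = 0.95)  # Lower 95%, Upper 95%

cbind(coefs, ci)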

How to set the Coefficient Value in Regression; R

I'm looking for a way to fix the value of a predictor's coefficient. When I run a glm with my current data, the coefficient for one of my variables is close to one; I'd like to set it to 0.8. I know this will give me a lower R^2 value, but I know a priori that the predictive power of the model will be greater. The weights component of glm looks promising, but I haven't figured it out yet. Any help would be greatly appreciated.

Answer: I believe you are looking for the offset argument in glm. So for example, you might do something like this:

glm(y ~ x1, offset = x2, ...)

where in this case the
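A minimal sketch of that idea with the coefficient fixed at 0.8 rather than 1: an offset enters the linear predictor with a coefficient of exactly 1, so pre-scaling the variable fixes its effective coefficient (variable names follow the excerpt; a data frame d is assumed):

# x2 contributes 0.8 * x2 to the linear predictor, with no coefficient estimated for it
fit <- glm(y ~ x1 + offset(0.8 * x2), data = d)
summary(fit)   # only the intercept and x1 are estimated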

Polynomial Regression nonsense Predictions

Suppose I want to fit a linear regression model with a degree-two (orthogonal) polynomial and then predict the response. Here is the code for the first model (m1):

x = 1:100
y = -2 + 3*x - 5*x^2 + rnorm(100)
m1 = lm(y ~ poly(x, 2))
prd.1 = predict(m1, newdata = data.frame(x = 105:110))

Now let's try the same model, but instead of using poly(x, 2) directly, I will use its columns:

m2 = lm(y ~ poly(x, 2)[, 1] + poly(x, 2)[, 2])
prd.2 = predict(m2, newdata = data.frame(x = 105:110))

Let's look at the summaries
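A short sketch of the likely culprit (an assumed explanation, not stated in the excerpt): for m2, predict() re-evaluates poly(x, 2) on the new x values, so the orthogonal basis is rebuilt from those six points instead of being extended from the training basis, whereas m1 stores and reuses the original basis coefficients:

p_train <- poly(x, 2)                             # basis built on the training x

p_new_ok  <- predict(p_train, newdata = 105:110)  # extends the SAME basis
p_new_bad <- poly(105:110, 2)                     # a different basis entirely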

how can I do a maximum likelihood regression using scipy.optimize.minimize

How can I do a maximum likelihood regression using scipy.optimize.minimize? I specifically want to use the minimize function here, because I have a complex model and need to add some constraints. I am currently trying a simple example using the following:

from scipy.optimize import minimize

def lik(parameters):
    m = parameters[0]
    b = parameters[1]
    sigma = parameters[2]
    for i in np.arange(0, len(x)):
        y_exp = m * x + b
    L = sum(np.log(sigma) + 0.5 * np.log(2 * np.pi) + (y - y_exp) ** 2 / (2 *

How can I omit interactions using stargazer or xtable?

Is it possible to omit interactions in stargazer using the omit option? Normally I would write the variable name in omit = c('varname'), but in the case of an interaction I do not know what to write. Any hints? How do you solve this problem in other packages such as xtable?

\documentclass{article}
\begin{document}
%Load dataset and run regression
<< lm, echo=FALSE >>=
load('dataset.RData')
library(stargazer)
lm1 <- lm(y ~ x + factor(v)*z, data = dataset)
@
<< table_texstyle, echo=FALSE, comment=NA, results='asis' >>=
stargazer(lm1, omit = c('???'), omit.labels = c('Omitted interactions'),
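One possible approach, assuming (as stargazer's documentation describes) that omit is treated as a vector of regular expressions matched against coefficient names: interaction coefficients are named with a colon (e.g. "factor(v)1:z"), so a pattern containing ":" should match all of them.

stargazer(lm1,
          omit        = ":",
          omit.labels = "Omitted interactions")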

difference between LinearRegression and svm.SVR(kernel="linear")

First, there are questions on this forum very similar to this one, but trust me, none matches, so no duplicating please. I have encountered two ways of doing linear regression with scikit-learn (sklearn) and I am failing to understand the difference between them, especially since the first piece of code calls train_test_split() while the other one calls fit directly. I am studying from multiple resources and this single issue is very confusing to me. First, the one that uses SVR:

X = np.array(df.drop(['label'], 1))
X = preprocessing.scale(X)
y = np.array(df['label'])
X_train, X_test, y