linear-regression

How to only print (adjusted) R-squared of regression model?

半世苍凉 submitted on 2020-01-16 18:29:42
Question: I am a beginner with R. I have a data set on air pollution. The columns are site, measured concentration, and 80 variables (v1-v80) that might influence the concentration. I want to build a model with forward stepwise regression based on adjusted R-squared using my own code (so I do not want to use something like step() or regsubsets()). The dependent variable is concentration, with v1-v80 as independent variables. I wrote the following code for the first step (the data set is simplified):
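Since the asker's own code is truncated, here is a minimal sketch of forward selection by adjusted R-squared, written in Python/NumPy rather than the asker's R; the array inputs `X` and `y` are hypothetical stand-ins for the v1-v80 matrix and the concentration column:

```python
import numpy as np

def adjusted_r2(y, y_hat, n_params):
    """Adjusted R-squared for a fit with n_params predictors plus an intercept."""
    n = len(y)
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    r2 = 1.0 - ss_res / ss_tot
    return 1.0 - (1.0 - r2) * (n - 1) / (n - n_params - 1)

def forward_stepwise(X, y):
    """Greedily add the column that most improves adjusted R-squared."""
    n, p = X.shape
    selected, best_adj = [], -np.inf
    while len(selected) < p:
        scores = {}
        for j in range(p):
            if j in selected:
                continue
            cols = selected + [j]
            A = np.column_stack([np.ones(n), X[:, cols]])
            beta, *_ = np.linalg.lstsq(A, y, rcond=None)
            scores[j] = adjusted_r2(y, A @ beta, len(cols))
        j_best = max(scores, key=scores.get)
        if scores[j_best] <= best_adj:
            break  # no remaining candidate improves adjusted R-squared
        best_adj = scores[j_best]
        selected.append(j_best)
    return selected, best_adj
```

Each round refits with every remaining candidate column and keeps the one that raises adjusted R-squared the most, stopping when no candidate improves it — the same greedy rule the asker describes.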

How to add a range to sklearn's linear regression predictions

倖福魔咒の submitted on 2020-01-16 11:58:26
Question: I wonder if there is a way to add a range to the predictions prior to fitting the model. The variable in question in my train data is technically a percentage score, but when I predict my test set, I get negative values or values >100. For now, I am manually normalizing the predictions list. I also used to cut off negatives and values >100 and assign them 0 and 100. However, it would only make sense if the fit function could be made aware of this constraint, right? Here is a sample row of the data:
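Rather than clipping after the fact, one option (an assumption, not something from the question) is to regress on the logit of the percentage, so that predictions mapped back through the sigmoid can never leave (0, 100). A NumPy-only sketch, with hypothetical `X` and `y_pct` in place of the asker's data:

```python
import numpy as np

def logit(p):
    return np.log(p / (1.0 - p))

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_bounded(X, y_pct, eps=1e-3):
    """Least-squares fit on the logit of the percentage target, so that
    back-transformed predictions land strictly inside (0, 100)."""
    p = np.clip(np.asarray(y_pct, float) / 100.0, eps, 1.0 - eps)  # avoid +/-inf at 0 and 100
    A = np.column_stack([np.ones(len(p)), X])
    beta, *_ = np.linalg.lstsq(A, logit(p), rcond=None)
    return beta

def predict_bounded(beta, X):
    A = np.column_stack([np.ones(X.shape[0]), X])
    return 100.0 * sigmoid(A @ beta)
```

The same transform works with sklearn's LinearRegression by fitting on `logit(y/100)` and sigmoid-mapping the predictions; the model itself is never told about the bounds, the transform enforces them.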

Summary not working for OLS estimation

烈酒焚心 submitted on 2020-01-14 08:17:32
Question: I am having an issue with my statsmodels OLS estimation. The model runs without any issues, but when I try to call for a summary so that I can see the actual results, I get a TypeError saying the axis needs to be specified when the shapes of a and weights differ. My code looks like this: from __future__ import print_function, division import xlrd as xl import numpy as np import scipy as sp import pandas as pd import statsmodels.formula.api as smf import statsmodels.api as sm file_loc = "/Users
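That TypeError usually points at inconsistent shapes or dtypes in the arrays handed to OLS rather than at summary() itself. Since the code above is truncated, here is a plain-NumPy cross-check (a hypothetical helper, not part of the statsmodels API) to confirm the endog/exog inputs and the fit are sane before blaming summary():

```python
import numpy as np

def ols_check(X, y):
    """Shape/dtype sanity check plus a plain least-squares fit, as a
    cross-check when statsmodels' summary() raises unexpectedly."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float).ravel()
    assert X.ndim == 2 and X.shape[0] == y.shape[0], "endog/exog row counts differ"
    A = np.column_stack([np.ones(len(y)), X])   # explicit intercept column
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    r2 = 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)
    return beta, r2
```

If this runs cleanly but `sm.OLS(...).fit().summary()` still fails, the problem is more likely in how the DataFrame columns (e.g. object dtypes from xlrd) are being passed than in the model itself.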

How do you remove an insignificant factor level from a regression using the lm() function in R?

别等时光非礼了梦想. submitted on 2020-01-13 04:28:25
Question: When I perform a regression in R and use type factor, it helps me avoid setting up the categorical variables in the data. But how do I remove a factor level that is not significant from the regression, so that only significant variables are shown? For example: dependent <- c(1:10); independent1 <- as.factor(c('d','a','a','a','a','a','a','b','b','c')); independent2 <- c(-0.71,0.30,1.32,0.30,2.78,0.85,-0.25,-1.08,-0.94,1.33); output <- lm(dependent ~ independent1 + independent2); summary(output) Which results in the
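One way to illustrate the mechanics (in Python/NumPy rather than R, with the thread's small dataset inlined) is to dummy-code the factor by hand and fold the unwanted level into the baseline — roughly what collapsing levels achieves in R. Caveat: dropping a level just because it is insignificant is statistically questionable; this sketch only shows how the design matrix changes:

```python
import numpy as np

def dummy_columns(labels, drop=()):
    """One-hot encode labels; the first sorted level is the baseline,
    and any level listed in `drop` is folded into the baseline as well."""
    levels = sorted(set(labels))
    keep = [lv for lv in levels[1:] if lv not in drop]
    cols = np.array([[1.0 if x == lv else 0.0 for lv in keep] for x in labels])
    return cols, keep

def fit_with_factor(labels, x2, y, drop=()):
    """Least-squares fit of y on the (possibly reduced) factor dummies plus x2."""
    D, keep = dummy_columns(labels, drop)
    A = np.column_stack([np.ones(len(y)), D, np.asarray(x2, float)])
    beta, *_ = np.linalg.lstsq(A, np.asarray(y, float), rcond=None)
    return dict(zip(["(Intercept)"] + keep + ["independent2"], beta))
```

Passing `drop=('d',)` removes the 'd' dummy, so the single 'd' observation is treated as the baseline level — the coefficient table then contains only the remaining levels.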

Fminsearch Matlab (Non-Linear Regression)

折月煮酒 submitted on 2020-01-12 10:53:09
Question: Can anyone explain to me how I can apply non-linear regression to this equation to find out k using the Matlab command window? I = 10^-9 * (exp(38.68*V/k) - 1). [Screenshot of Equation] I have data values as follows: Voltage := [0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]: Current := [0, 0, 0, 0, 0, 0, 0, 0.07, 0.92, 12.02, 158.29]: [NEW]: Now I used fminsearch as an alternative, and another error message appeared: "Matrix dimensions must agree." Error in @(k)sum((I
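The "Matrix dimensions must agree" error in Matlab typically means elementwise operators (`.*`, `./`, `.^`) are missing in the objective, or that V and I have different lengths. As a cross-check, the same least-squares fit can be done with a crude 1-D search in Python/NumPy (a stand-in for `fminsearch` over the scalar k, using the data from the question):

```python
import numpy as np

# data from the question
V = np.array([0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0])
I = np.array([0, 0, 0, 0, 0, 0, 0, 0.07, 0.92, 12.02, 158.29])

def model(k, v):
    # I = 10^-9 * (exp(38.68*V/k) - 1); note the elementwise operations
    return 1e-9 * (np.exp(38.68 * v / k) - 1.0)

def sse(k):
    """Sum of squared residuals; V and I must have equal length."""
    return float(np.sum((I - model(k, V)) ** 2))

# simple grid search over a plausible range of k,
# standing in for fminsearch(@(k) sse(k), k0) in Matlab
ks = np.linspace(1.0, 3.0, 2001)
k_best = ks[np.argmin([sse(k) for k in ks])]
```

With these data the minimum lands near k ≈ 1.5, which reproduces the measured 158.29 at V = 1.0.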

Linear model singular because of large integer datetime in R?

前提是你 submitted on 2020-01-11 12:09:11
Question: A simple regression of a random normal variable on a date fails, but identical data with small integers instead of dates works as expected. # Example dataset with 100 observations at 2-second intervals. set.seed(1); df <- data.frame(x=as.POSIXct("2017-03-14 09:00:00") + seq(0, 199, 2), y=rnorm(100)) #> head(df) # x y # 1 2017-03-14 09:00:00 -0.6264538 # 2 2017-03-14 09:00:02 0.1836433 # 3 2017-03-14 09:00:04 -0.8356286 # Simple regression model. m <- lm(y ~ x, data=df) The slope is missing due to
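The likely culprit is numerical conditioning: a POSIXct regressor is epoch seconds (on the order of 1.5e9), so the intercept and slope columns of the design matrix differ by nine orders of magnitude and the slope can be dropped as aliased. Centering the timestamps (`df$x - min(df$x)` in R) fixes it. A Python/NumPy illustration of the same fix, with synthetic data mimicking the example:

```python
import numpy as np

# synthetic analogue of the R example: 100 obs, 2 s apart,
# x as raw epoch seconds (~1.5e9, like as.numeric of a POSIXct)
t = 1.489482e9 + np.arange(0, 200, 2, dtype=float)
rng = np.random.default_rng(1)
y = rng.normal(size=100)

def fit_line(x, y):
    """Least-squares intercept and slope of y on x."""
    A = np.column_stack([np.ones(len(x)), x])
    return np.linalg.lstsq(A, y, rcond=None)[0]

beta_raw = fit_line(t, y)              # huge-magnitude regressor: fragile
beta_centered = fit_line(t - t[0], y)  # well conditioned; same slope
```

In R the equivalent is `lm(y ~ I(as.numeric(x) - min(as.numeric(x))), data=df)`; the slope is unchanged, only the intercept's meaning shifts to "value at the first timestamp".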

Aggregate linear regression

主宰稳场 submitted on 2020-01-11 09:36:30
Question: Sorry, I am quite new to R, but I have a dataframe with game logs for multiple players. I am trying to get the slope coefficient for each player's points over all of their games. I have seen that aggregate can use functions like sum and mean, and getting coefficients from a linear regression is pretty simple as well. How do I combine these? a <- c("player1","player1","player1","player2","player2","player2"); b <- c(1,2,3,4,5,6); c <- c(15,12,13,4,15,9); gamelogs <- data.frame(name=a, game=b
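The grouping logic is small enough to sketch with the thread's own data — shown here in Python/NumPy (the R analogue would be something like `by()` or `split()` combined with `coef(lm(points ~ game, d))`):

```python
import numpy as np

# the thread's dataset, inlined
name = ["player1", "player1", "player1", "player2", "player2", "player2"]
game = np.array([1, 2, 3, 4, 5, 6], dtype=float)
points = np.array([15, 12, 13, 4, 15, 9], dtype=float)

def slopes_by_group(names, x, y):
    """OLS slope of y on x within each group."""
    out = {}
    for g in sorted(set(names)):
        mask = np.array([n == g for n in names])
        out[g] = np.polyfit(x[mask], y[mask], 1)[0]  # degree-1 fit: [slope, intercept]
    return out

result = slopes_by_group(name, game, points)
```

For these numbers the per-player slopes come out to -1.0 for player1 and 2.5 for player2.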

Keras regression clip values

纵饮孤独 submitted on 2020-01-11 09:25:10
Question: I want to clip values; how could I do that? I tried using this: from keras.backend.tensorflow_backend import clip from keras.layers.core import Lambda ... model.add(Dense(1)) model.add(Activation('linear')) model.add(Lambda(lambda x: clip(x, min_value=200, max_value=1000))) But no matter where I put my Lambda + clip, it does not affect anything. Answer 1: It actually has to be implemented as a loss, at the model.compile step. from keras import backend as K def clipped_mse(y_true, y_pred):
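Besides the clipped-loss approach quoted above, another common pattern (an assumption on my part, not from the thread) is to bound the output structurally with a scaled sigmoid on the final layer; unlike a hard clip, its gradient is defined everywhere, so training is not starved outside the bounds. The mapping itself, sketched framework-free in NumPy:

```python
import numpy as np

LOW, HIGH = 200.0, 1000.0

def bounded_activation(x):
    """Scaled sigmoid mapping any real activation into (LOW, HIGH).
    Keras analogue (hypothetical, not from the thread):
    model.add(Activation(lambda x: LOW + (HIGH - LOW) * K.sigmoid(x)))"""
    return LOW + (HIGH - LOW) / (1.0 + np.exp(-np.asarray(x, dtype=float)))
```

A value of 0 maps to the midpoint (600 here), and large positive or negative activations saturate toward 1000 or 200 without ever reaching them.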

Calculating the number of dots lie above and below the regression line with R [closed]

北慕城南 submitted on 2020-01-11 08:46:32
Question (closed 7 years ago): How do I calculate the number of dots that lie above and below the regression line on a scatter plot? data = read.csv("info.csv") par(pty = "s") plot(data$col1, data$col2, xlab = "xaxis", ylab = "yaxis", xlim = c
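The mechanics reduce to the sign of the residuals: fit the line, subtract the fitted values, and count positive versus negative residuals. A minimal Python/NumPy sketch (the R analogue would count the signs of `residuals(lm(col2 ~ col1, data))`):

```python
import numpy as np

def count_above_below(x, y):
    """Fit y ~ x by least squares, then count points above and below the line."""
    slope, intercept = np.polyfit(x, y, 1)
    resid = y - (slope * x + intercept)          # positive: above the line
    return int(np.sum(resid > 0)), int(np.sum(resid < 0))
```

Points lying exactly on the line are counted in neither bucket; for plotted data that case is rare but worth deciding explicitly.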