regression

Constrained regression in Python

泪湿孤枕 submitted on 2019-12-24 08:54:58
Question: I have this simple regression model:

y = a + b * x + c * z + error

with a constraint on the parameters: c = b - 1. There are similar questions posted on SO (like Constrained Linear Regression in Python), but the constraints there are of the type lb <= parameter <= ub. What are the available options for handling this specific constrained linear regression problem?

Answer 1: This is how it can be done using GLM:

import statsmodels
import statsmodels.api as sm
import numpy as np

# Set the link function to identity
…
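
Whichever solver is used, it helps to note why this particular constraint is easy to handle: substituting c = b - 1 into the model reduces it to an ordinary unconstrained regression of a transformed response on a transformed predictor (a small derivation added here, not part of the original answer):

y = a + b x + c z + \varepsilon, \qquad c = b - 1
\;\Rightarrow\; y = a + b x + (b - 1) z + \varepsilon
\;\Rightarrow\; y + z = a + b\,(x + z) + \varepsilon

Regressing y + z on x + z therefore estimates a and b directly, and c is recovered as b - 1.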

Error in a bivariate logistic model in R

不羁岁月 submitted on 2019-12-24 08:04:18
Question: I have run into an unexpected error in my research. Let me show you several code chunks; I hope you can help. I have two binary variables, alco and smoke, which were generated like this:

smoke <- factor(with(df, ifelse((q34 < 2), 1, 0)))
alco  <- factor(with(df, ifelse((q47 == 1), 1, 0)))
df    <- cbind(df, smoke, alco, educ_3, smoke_14)

I tried to fit a model using the zeligverse package:

m3 <- zelig(cbind(smoke, alco) ~ fem + age + age2 + smoke_14 + ninc, model = "blogit", data = df)

which leads to the error …

Regression using Python

╄→尐↘猪︶ㄣ submitted on 2019-12-24 07:17:46
Question: I have the following variables:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split

np.random.seed(0)
n = 15
x = np.linspace(0, 10, n) + np.random.randn(n)/5
y = np.sin(x) + x/6 + np.random.randn(n)/10
X_train, X_test, y_train, y_test = train_test_split(x, y, random_state=0)

def part1_scatter():
    %matplotlib notebook
    plt.figure()
    plt.scatter(X_train, y_train, label='training data')
    plt.scatter(X_test, y_test, label='test data'…

Is there an implementation of loess in R with more than 3 parametric predictors or a trick to a similar effect?

北城余情 submitted on 2019-12-24 04:48:06
Question: Calling all experts on local regression and/or R! I have run into a limitation of the standard loess function in R and hope you have some advice. The current implementation supports only 1-4 predictors. Let me set out our application scenario to show why this quickly becomes a problem as soon as we want to employ globally fit parametric covariables. Essentially, we have a spatial distortion s(x, y) overlaid over a number of measurements z:

z_i = s(x_i, y_i) + v_{g_i}

These measurements z…
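
One commonly suggested workaround, sketched here under assumed names rather than taken from this thread, is to switch from loess() to mgcv::gam(): a two-dimensional smooth handles the spatial part while any number of globally fit parametric covariables enter alongside it. Here dat, x, y, z and the grouping factor g are hypothetical stand-ins for the quantities described above.

library(mgcv)

# s(x, y) absorbs the smooth spatial distortion s(x, y);
# the factor g enters as an ordinary, globally fit parametric term v_g
fit <- gam(z ~ s(x, y) + g, data = dat)

summary(fit)           # smooth term and parametric coefficients
plot(fit, scheme = 2)  # visualise the estimated spatial surface

This is not loess itself, but it gives a comparable local-smoothing effect for the spatial term without the 4-predictor limit on the parametric side.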

Easily performing the same regression on different datasets

点点圈 submitted on 2019-12-24 04:44:30
Question: I'm performing the same regression on several different datasets (same dependent and independent variables). However, there are many independent variables, and I often want to test adding or removing different variables. I'd like to avoid making all these changes on different lines of code just because they use different datasets. Can I instead just copy the formula that was used to create some object, and then create a new object using a different dataset? For example, something like:

fit1 <- lm…
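
A minimal sketch of the usual pattern (with hypothetical data frames df1 and df2 and made-up variable names): keep the formula in one object, or reuse an existing fit via formula()/update(), so the variable list only has to be edited in one place.

f <- y ~ x1 + x2 + x3          # edit the set of predictors here only

fit1 <- lm(f, data = df1)
fit2 <- lm(f, data = df2)

# Equivalent alternatives that reuse an existing fit:
fit2b <- update(fit1, data = df2)
fit2c <- lm(formula(fit1), data = df2)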

ggplot2: How to plot an orthogonal regression line?

拟墨画扇 submitted on 2019-12-24 04:29:34
Question: I have tested a large sample of participants on two different tests of visual perception, and now I'd like to see to what extent performance on the two tests correlates. To visualise the correlation, I plot a scatterplot in R using ggplot() and fit a regression line (using stat_smooth()). However, since both my x and y variables are performance measures, I need to take both of them into account when fitting the regression line; thus, I cannot use a simple linear regression (using stat_smooth…
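
A sketch of one way to get such a line (assuming a data frame d with columns x and y holding the two test scores): the orthogonal, total-least-squares line runs along the first principal component of the two variables, so its slope and intercept can be computed with prcomp() and drawn with geom_abline().

library(ggplot2)

pc        <- prcomp(cbind(d$x, d$y))                # first PC = orthogonal regression direction
slope     <- pc$rotation[2, 1] / pc$rotation[1, 1]
intercept <- mean(d$y) - slope * mean(d$x)           # line passes through the centroid

ggplot(d, aes(x, y)) +
  geom_point() +
  geom_abline(slope = slope, intercept = intercept)

This treats the error variances of the two measures as equal; if they differ, a Deming-type regression with an explicit variance ratio would be more appropriate.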

Python Statsmodels: OLS regressor not predicting

99封情书 submitted on 2019-12-24 02:19:14
Question: I wrote the following piece of code, but I just cannot get the predict method to work:

import statsmodels.api as sm
from statsmodels.formula.api import ols

ols_model = ols('Consumption ~ Disposable_Income', df).fit()

My df is a pandas DataFrame with column headings 'Consumption' and 'Disposable_Income'. When I run, for example, ols_model.predict([1000.0]) I get: "TypeError: list indices must be integers, not str". When I run, for example, ols_model.predict(df['Disposable_Income'].values) I…

Unable to get R-squared for test dataset

旧巷老猫 submitted on 2019-12-24 01:27:49
Question: I am trying to learn a bit about different types of regression and I am hacking my way through the code sample below.

library(magrittr)
library(dplyr)

# Polynomial degree 1
df <- read.csv("C:\\path_here\\auto_mpg.csv", stringsAsFactors = FALSE)  # Data from UCI
df1 <- as.data.frame(sapply(df, as.numeric))

# Select key columns
df2 <- df1 %>% select(cylinder, displacement, horsepower, weight, acceleration, year, mpg)
df3 <- df2[complete.cases(df2), ]

smp_size <- floor(0.75 * nrow(df3))
# Split as train and…
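
For the held-out data there is no summary()$r.squared to read off, but the test-set R-squared can be computed by hand. A short sketch, assuming a fitted model fit, a test data frame test, and mpg as the outcome (names following the code above):

pred   <- predict(fit, newdata = test)

ss_res <- sum((test$mpg - pred)^2)             # residual sum of squares
ss_tot <- sum((test$mpg - mean(test$mpg))^2)   # total sum of squares
1 - ss_res / ss_tot                            # R-squared on the test set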

How to predict a new value using simple linear regression log(y)=b0+b1*log(x)

随声附和 submitted on 2019-12-24 00:24:22
Question: How do I predict a new given value of body using the ml2 model below, and how do I interpret its output (the new predicted output only, not the model)? I am using the Animals dataset from the MASS package to build a simple linear regression model:

ml2 <- lm(log(brain) ~ log(body), data = Animals)

Predict for a new given body of 468:

pred_body <- data.frame(body = c(468))
predict(ml2, pred_body, interval = "confidence")
        fit      lwr      upr
1  5.604506 4.897498 6.311513

But I am not so sure: is the predicted y (brain) = 5.6, or is log(brain) = 5.6? How could we get the…
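
Because the response in ml2 is log(brain), the fitted value 5.604506 is on the log scale; exponentiating the fit (and the interval limits) converts back to the original brain units. A short sketch:

library(MASS)

ml2 <- lm(log(brain) ~ log(body), data = Animals)

pred_body <- data.frame(body = 468)   # body is supplied on the original scale;
                                      # log() is applied inside the formula
pred_log  <- predict(ml2, pred_body, interval = "confidence")
exp(pred_log)                          # fit, lwr, upr back-transformed to brain units

Note that exp() of the fitted mean of log(brain) is a geometric-mean (median-type) prediction rather than the arithmetic mean of brain.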

How to plot confidence bands for my weighted log-log linear regression?

橙三吉。 submitted on 2019-12-23 23:18:11
Question: I need to plot an exponential species-area relationship using the exponential form of a weighted log-log linear model, where the mean species number per location/Bank (sb$NoSpec.mean) is weighted by the variance in species number per year (sb$NoSpec.var). I am able to plot the fit, but I have trouble figuring out how to plot the confidence intervals around this fit. The following is the best I have come up with so far. Any advice for me?

# Data
df <- read.csv("YearlySpeciesCount_SizeGroups.csv")…
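
A sketch of the usual recipe (with an assumed predictor column, here called Area, and inverse-variance weights as one common convention): ask predict() for a confidence interval on a grid of new values, then exponentiate the fit and its limits before plotting so the band appears on the original species scale.

# Hypothetical weighted log-log fit; Area stands in for the area column in sb
fit <- lm(log(NoSpec.mean) ~ log(Area), data = sb, weights = 1 / NoSpec.var)

newdat <- data.frame(Area = seq(min(sb$Area), max(sb$Area), length.out = 200))
ci     <- predict(fit, newdata = newdat, interval = "confidence")

# Back-transform to the species scale and draw fit plus confidence band
plot(NoSpec.mean ~ Area, data = sb, log = "xy")
lines(newdat$Area, exp(ci[, "fit"]))
lines(newdat$Area, exp(ci[, "lwr"]), lty = 2)
lines(newdat$Area, exp(ci[, "upr"]), lty = 2)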