linear-regression

Plotting conditional density of prediction after linear regression

Question: This is my data frame:

    data <- structure(list(Y = c(NA, -1.793, -0.642, 1.189, -0.823, -1.715,
        1.623, 0.964, 0.395, -3.736, -0.47, 2.366, 0.634, -0.701, -1.692,
        0.155, 2.502, -2.292, 1.967, -2.326, -1.476, 1.464, 1.45, -0.797,
        1.27, 2.515, -0.765, 0.261, 0.423, 1.698, -2.734, 0.743, -2.39,
        0.365, 2.981, -1.185, -0.57, 2.638, -1.046, 1.931, 4.583, -1.276,
        1.075, 2.893, -1.602, 1.801, 2.405, -5.236, 2.214, 1.295, 1.438,
        -0.638, 0.716, 1.004, -1.328, -1.759, -1.315, 1.053, 1.958, -2.034,
        2.936,
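A minimal sketch of one approach: under the usual lm assumptions, the response at a new point is approximately normal around the fitted value with the residual standard error as its spread, so the conditional density can be drawn with dnorm. The predictor column X and the query point are assumptions, since the excerpt is truncated before the rest of the data frame:

    # Hypothetical predictor X alongside Y
    fit  <- lm(Y ~ X, data = data)
    x0   <- data.frame(X = 1)                        # hypothetical new point
    pred <- predict(fit, newdata = x0, se.fit = TRUE)
    grid <- seq(pred$fit - 4 * sigma(fit), pred$fit + 4 * sigma(fit),
                length.out = 200)
    # Density of Y | X = x0: normal with mean = fitted value, sd = residual SE
    plot(grid, dnorm(grid, mean = pred$fit, sd = sigma(fit)), type = "l",
         xlab = "Y", ylab = "conditional density")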

R: How to get a proper LaTeX regression table from a data frame?

Question: Consider the following example:

    inds   <- c('var1', '', 'var2', '')
    model1 <- c(10.2, 0.00, 0.02, 0.3)
    model2 <- c(11.2, 0.01, 0.02, 0.023)
    df <- data.frame(inds, model1, model2)
    df
      inds model1 model2
      var1  10.20 11.200
             0.00  0.010
      var2   0.02  0.020
              0.30  0.023

Here you have the output of a custom regression model with coefficients and p-values (I can actually show any other statistics if I need to, say, the standard errors of the coefficients). There are two variables, var1 and var2. For instance, in model1
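A minimal sketch of one route, using the xtable package to turn the data frame into a LaTeX tabular; xtable is an assumption here, and stargazer or knitr::kable(format = "latex") are common alternatives:

    library(xtable)
    # The blank inds cells hold the p-value rows under each coefficient
    print(xtable(df, caption = "Regression results"),
          include.rownames = FALSE)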

R: Error in contrasts when fitting linear models with `lm`

Question: I've found "Error in contrasts when defining a linear model in R" and have followed the suggestions there, but none of my factor variables takes on only one value, and I am still experiencing the same issue. This is the dataset I'm using: https://www.dropbox.com/s/em7xphbeaxykgla/train.csv?dl=0. This is the code I'm trying to run:

    simplelm <- lm(log_SalePrice ~ ., data = train)
    # Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) :
    #   contrasts can be applied only to factors with 2 or more levels
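A minimal diagnostic sketch: this error usually means some factor column is left with fewer than two observed levels once lm() drops rows containing NAs, even when the declared levels look fine. The check below is an assumed diagnosis, not part of the question:

    # Flag factor columns with fewer than two observed (non-NA) levels
    is_bad <- sapply(train, function(col)
      is.factor(col) && length(unique(na.omit(col))) < 2)
    names(train)[is_bad]   # drop or recode these columns before refitting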

ggplot2: Plotting regression lines with different intercepts but the same slope

Question: I want to plot regression lines with different intercepts but the same slope. With the following ggplot2 code I can plot regression lines with different intercepts and different slopes, but I could not figure out how to draw regression lines with different intercepts and the same slope.

    library(ggplot2)
    ggplot(data = df3, mapping = aes(x = Income, y = Consumption, color = Gender)) +
      geom_point() +
      geom_smooth(data = df3, method = "lm", se = FALSE,
                  mapping = aes(x = Income, y = Consumption))
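A minimal sketch of one approach: fit a single model with a common slope and group-specific intercepts, then draw the fitted lines with geom_line instead of geom_smooth (df3 and its columns are taken from the question; the fitted column name is an assumption):

    library(ggplot2)
    fit <- lm(Consumption ~ Income + Gender, data = df3)  # one slope, shifted intercepts
    df3$fitted <- predict(fit)
    ggplot(df3, aes(x = Income, y = Consumption, color = Gender)) +
      geom_point() +
      geom_line(aes(y = fitted))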

AttributeError: LinearRegression object has no attribute 'coef_'

Question: I've been attempting to fit this data with a linear regression, following a tutorial on bigdataexaminer. Everything was working fine up until this point: I imported LinearRegression from sklearn and printed the number of coefficients just fine. This was the code before I attempted to grab the coefficients from the console.

    import numpy as np
    import pandas as pd
    import scipy.stats as stats
    import matplotlib.pyplot as plt
    import sklearn
    from sklearn.datasets import load_boston
    from sklearn.linear_model import LinearRegression
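A minimal sketch of the usual cause: coef_ (and intercept_) are only set after fit() has been called, so reading them from a fresh estimator raises AttributeError. The tiny dataset below is illustrative, not the tutorial's Boston data:

    import numpy as np
    from sklearn.linear_model import LinearRegression

    X = np.array([[1.0], [2.0], [3.0]])
    y = np.array([2.0, 4.0, 6.0])
    lm = LinearRegression()
    # lm.coef_ here would raise AttributeError: the model is not fitted yet
    lm.fit(X, y)
    print(lm.coef_, lm.intercept_)   # available only after fitting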

Why do I get [nan] when using TensorFlow to calculate a simple linear regression?

Question: When I use TensorFlow to calculate a simple linear regression I get [nan] everywhere, including w, b and the loss. Here is my code:

    import tensorflow as tf

    w = tf.Variable(tf.zeros([1]), tf.float32)
    b = tf.Variable(tf.zeros([1]), tf.float32)
    x = tf.placeholder(tf.float32)
    y = tf.placeholder(tf.float32)

    liner = w * x + b
    loss = tf.reduce_sum(tf.square(liner - y))
    train = tf.train.GradientDescentOptimizer(1).minimize(loss)

    sess = tf.Session()

    x_data = [1000, 2000, 3000, 4000, 5000, 6000, 7000, 8000, 9000]
    y_data =
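A minimal sketch of the likely fix: with inputs in the thousands and a learning rate of 1, gradient descent overshoots on every step and w, b and the loss blow up to NaN. Rescaling the inputs or shrinking the learning rate stabilises training. The sketch keeps the question's TF1-style API; the y_data values are illustrative, since the excerpt cuts off before them:

    import tensorflow as tf  # TF 1.x API, as in the question

    w = tf.Variable(tf.zeros([1]), tf.float32)
    b = tf.Variable(tf.zeros([1]), tf.float32)
    x = tf.placeholder(tf.float32)
    y = tf.placeholder(tf.float32)
    loss = tf.reduce_sum(tf.square(w * x + b - y))
    train = tf.train.GradientDescentOptimizer(0.01).minimize(loss)  # much smaller step

    x_data = [v / 1000.0 for v in [1000, 2000, 3000, 4000]]  # rescaled inputs
    y_data = [1.0, 2.0, 3.0, 4.0]                            # illustrative targets

    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        for _ in range(200):
            sess.run(train, feed_dict={x: x_data, y: y_data})
        print(sess.run([w, b]))   # finite values instead of [nan]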

Extracting the terminal nodes of each tree associated with a new observation

Question: I would like to extract the terminal nodes from the R implementation of random forest. As I understand random forests, you have a sequence of orthogonal trees; when you predict a new observation (in regression), it is run through all of these trees and the predictions of the individual trees are averaged. If I wanted not to average but, say, run a linear regression on the corresponding observations, I would need a list of the training observations that are "associated" with this new observation. I have
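A minimal sketch with the randomForest package, whose predict method reports the terminal node each observation lands in for every tree via nodes = TRUE (iris is used purely for illustration):

    library(randomForest)
    rf <- randomForest(Sepal.Length ~ ., data = iris, ntree = 50)
    new_nodes   <- attr(predict(rf, newdata = iris[1:3, ], nodes = TRUE), "nodes")
    train_nodes <- attr(predict(rf, newdata = iris, nodes = TRUE), "nodes")
    # Training rows "associated" with new observation 1 in tree 1:
    which(train_nodes[, 1] == new_nodes[1, 1])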

Covariance matrix from np.polyfit() has negative diagonal?

Question: Problem: the cov=True option of np.polyfit() produces a covariance matrix whose diagonal contains nonsensical negative values. UPDATE: after playing with this some more, I am really starting to suspect a bug in numpy. Is that possible? Deleting any pair of 13 values from the dataset fixes the problem. I am using np.polyfit() to calculate the slope and intercept coefficients of a dataset. A plot of the values produces a very (but not perfectly) linear graph. I am attempting to get the standard deviation on
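A minimal sketch of the intended usage, to separate a misread result from a genuine bug: the parameter standard deviations are the square roots of the covariance diagonal, so a negative diagonal immediately produces NaNs. The data below is illustrative; note also that some NumPy versions scale this covariance by the residuals over (N - deg - 2), which can behave surprisingly for small samples:

    import numpy as np

    rng = np.random.default_rng(0)
    x = np.arange(20, dtype=float)
    y = 2.0 * x + 1.0 + rng.normal(scale=0.5, size=x.size)
    coeffs, cov = np.polyfit(x, y, deg=1, cov=True)
    print(coeffs)                  # slope and intercept
    print(np.sqrt(np.diag(cov)))   # their standard deviations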

R: plm — year fixed effects — year and quarter data

Question: I am having a problem setting up a panel data model. Here is some sample data:

    library(plm)
    id   <- c(1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2)
    year <- c(1999,1999,1999,1999,2000,2000,2000,2000,
              1999,1999,1999,1999,2000,2000,2000,2000)
    qtr  <- c(1,2,3,4,1,2,3,4,1,2,3,4,1,2,3,4)
    y    <- rnorm(16, mean = 0, sd = 1)
    x    <- rnorm(16, mean = 0, sd = 1)
    data <- data.frame(id = id, year = year, qtr = qtr,
                       y_q = paste(year, qtr, sep = "_"), y = y, x = x)

I run the following regression using 'id' as the individual index and 'year' as the time
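A minimal sketch of one workaround: with quarterly rows, 'year' alone is not a valid time index, because each id-year pair appears four times; the year-quarter string can serve as the time index while year dummies supply the year fixed effects. This is an assumed resolution, since the excerpt cuts off before the actual model call:

    library(plm)
    pdata <- pdata.frame(data, index = c("id", "y_q"))
    fit   <- plm(y ~ x + factor(year), data = pdata, model = "within")
    summary(fit)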

How can I see multiple variables' outliers in one boxplot using R?

Question: I am a newbie to R and I have a question. To check a variable for outliers we generally use:

    boxplot(train$rate)

Suppose rate is a variable in my dataset and train is my dataset's name. But when I have many variables, say 100 or 150, checking each variable's outliers one by one is very time-consuming. Is there a function that shows the outliers of all 100 variables in one boxplot? If yes, which function can remove those variables' outliers at one time
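A minimal sketch of the screening step: boxplot() accepts a whole data frame and draws one box per column, so every numeric variable can be inspected at once (the numeric-column filter assumes the data frame mixes types):

    num_cols <- sapply(train, is.numeric)
    boxplot(train[, num_cols], las = 2, cex.axis = 0.6)  # one box per variable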