linear-regression

Plotting conditional density of prediction after linear regression

Question: This is my data frame:

    data <- structure(list(Y = c(NA, -1.793, -0.642, 1.189, -0.823, -1.715,
        1.623, 0.964, 0.395, -3.736, -0.47, 2.366, 0.634, -0.701, -1.692,
        0.155, 2.502, -2.292, 1.967, -2.326, -1.476, 1.464, 1.45, -0.797,
        1.27, 2.515, -0.765, 0.261, 0.423, 1.698, -2.734, 0.743, -2.39,
        0.365, 2.981, -1.185, -0.57, 2.638, -1.046, 1.931, 4.583, -1.276,
        1.075, 2.893, -1.602, 1.801, 2.405, -5.236, 2.214, 1.295, 1.438,
        -0.638, 0.716, 1.004, -1.328, -1.759, -1.315, 1.053, 1.958, -2.034,
        2.936,
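A minimal sketch of one approach: under the usual lm assumptions, the response at a new point is approximately normal around the fitted value with the residual standard error as its spread, so the conditional density can be drawn with dnorm. The predictor column X and the query point are assumptions, since the excerpt is truncated before the rest of the data frame:

    # Hypothetical predictor X alongside Y
    fit  <- lm(Y ~ X, data = data)
    x0   <- data.frame(X = 1)                        # hypothetical new point
    pred <- predict(fit, newdata = x0, se.fit = TRUE)
    grid <- seq(pred$fit - 4 * sigma(fit), pred$fit + 4 * sigma(fit),
                length.out = 200)
    # Density of Y | X = x0: normal with mean = fitted value, sd = residual SE
    plot(grid, dnorm(grid, mean = pred$fit, sd = sigma(fit)), type = "l",
         xlab = "Y", ylab = "conditional density")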

R: How to get a proper LaTeX regression table from a data frame?

Question: Consider the following example:

    inds   <- c('var1', '', 'var2', '')
    model1 <- c(10.2, 0.00, 0.02, 0.3)
    model2 <- c(11.2, 0.01, 0.02, 0.023)
    df <- data.frame(inds, model1, model2)
    df
      inds model1 model2
      var1  10.20 11.200
             0.00  0.010
      var2   0.02  0.020
              0.30  0.023

Here you have the output of a custom regression model with coefficients and p-values (I can actually show any other statistics if I need to, say, the standard errors of the coefficients). There are two variables, var1 and var2. For instance, in model1
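A minimal sketch of one route, using the xtable package to turn the data frame into a LaTeX tabular; xtable is an assumption here, and stargazer or knitr::kable(format = "latex") are common alternatives:

    library(xtable)
    # The blank inds cells hold the p-value rows under each coefficient
    print(xtable(df, caption = "Regression results"),
          include.rownames = FALSE)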

R: Error in contrasts when fitting linear models with `lm`

Question: I've found "Error in contrasts when defining a linear model in R" and have followed the suggestions there, but none of my factor variables takes on only one value, and I am still experiencing the same issue. This is the dataset I'm using: https://www.dropbox.com/s/em7xphbeaxykgla/train.csv?dl=0. This is the code I'm trying to run:

    simplelm <- lm(log_SalePrice ~ ., data = train)
    # Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) :
    #   contrasts can be applied only to factors with 2 or more levels
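A minimal diagnostic sketch: this error usually means some factor column is left with fewer than two observed levels once lm() drops rows containing NAs, even when the declared levels look fine. The check below is an assumed diagnosis, not part of the question:

    # Flag factor columns with fewer than two observed (non-NA) levels
    is_bad <- sapply(train, function(col)
      is.factor(col) && length(unique(na.omit(col))) < 2)
    names(train)[is_bad]   # drop or recode these columns before refitting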

ggplot2: Plotting regression lines with different intercepts but the same slope

Question: I want to plot regression lines with different intercepts but the same slope. With the following ggplot2 code I can plot regression lines with different intercepts and different slopes, but I could not figure out how to draw regression lines with different intercepts and the same slope.

    library(ggplot2)
    ggplot(data = df3, mapping = aes(x = Income, y = Consumption, color = Gender)) +
      geom_point() +
      geom_smooth(data = df3, method = "lm", se = FALSE,
                  mapping = aes(x = Income, y = Consumption))
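A minimal sketch of one approach: fit a single model with a common slope and group-specific intercepts, then draw the fitted lines with geom_line instead of geom_smooth (df3 and its columns are taken from the question; the fitted column name is an assumption):

    library(ggplot2)
    fit <- lm(Consumption ~ Income + Gender, data = df3)  # one slope, shifted intercepts
    df3$fitted <- predict(fit)
    ggplot(df3, aes(x = Income, y = Consumption, color = Gender)) +
      geom_point() +
      geom_line(aes(y = fitted))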

AttributeError: LinearRegression object has no attribute 'coef_'

Question: I've been attempting to fit this data with a linear regression, following a tutorial on bigdataexaminer. Everything was working fine up until this point: I imported LinearRegression from sklearn and printed the number of coefficients just fine. This was the code before I attempted to grab the coefficients from the console.

    import numpy as np
    import pandas as pd
    import scipy.stats as stats
    import matplotlib.pyplot as plt
    import sklearn
    from sklearn.datasets import load_boston
    from sklearn.linear_model import LinearRegression
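A minimal sketch of the usual cause: coef_ (and intercept_) are only set after fit() has been called, so reading them from a fresh estimator raises AttributeError. The tiny dataset below is illustrative, not the tutorial's Boston data:

    import numpy as np
    from sklearn.linear_model import LinearRegression

    X = np.array([[1.0], [2.0], [3.0]])
    y = np.array([2.0, 4.0, 6.0])
    lm = LinearRegression()
    # lm.coef_ here would raise AttributeError: the model is not fitted yet
    lm.fit(X, y)
    print(lm.coef_, lm.intercept_)   # available only after fitting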

Why do I get [nan] when using TensorFlow to calculate a simple linear regression?

Question: When I use TensorFlow to calculate a simple linear regression I get [nan] everywhere, including w, b and the loss. Here is my code:

    import tensorflow as tf

    w = tf.Variable(tf.zeros([1]), tf.float32)
    b = tf.Variable(tf.zeros([1]), tf.float32)
    x = tf.placeholder(tf.float32)
    y = tf.placeholder(tf.float32)

    liner = w * x + b
    loss = tf.reduce_sum(tf.square(liner - y))
    train = tf.train.GradientDescentOptimizer(1).minimize(loss)

    sess = tf.Session()

    x_data = [1000, 2000, 3000, 4000, 5000, 6000, 7000, 8000, 9000]
    y_data =
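A minimal sketch of the likely fix: with inputs in the thousands and a learning rate of 1, gradient descent overshoots on every step and w, b and the loss blow up to NaN. Rescaling the inputs or shrinking the learning rate stabilises training. The sketch keeps the question's TF1-style API; the y_data values are illustrative, since the excerpt cuts off before them:

    import tensorflow as tf  # TF 1.x API, as in the question

    w = tf.Variable(tf.zeros([1]), tf.float32)
    b = tf.Variable(tf.zeros([1]), tf.float32)
    x = tf.placeholder(tf.float32)
    y = tf.placeholder(tf.float32)
    loss = tf.reduce_sum(tf.square(w * x + b - y))
    train = tf.train.GradientDescentOptimizer(0.01).minimize(loss)  # much smaller step

    x_data = [v / 1000.0 for v in [1000, 2000, 3000, 4000]]  # rescaled inputs
    y_data = [1.0, 2.0, 3.0, 4.0]                            # illustrative targets

    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        for _ in range(200):
            sess.run(train, feed_dict={x: x_data, y: y_data})
        print(sess.run([w, b]))   # finite values instead of [nan]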

Extracting the terminal nodes of each tree associated with a new observation

Question: I would like to extract the terminal nodes from the R implementation of random forest. As I understand random forests, you have a sequence of orthogonal trees; when you predict a new observation (in regression), it is run through all of these trees and the predictions of the individual trees are averaged. If I wanted not to average but, say, run a linear regression on the corresponding observations, I would need a list of the training observations that are "associated" with this new observation. I have
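A minimal sketch with the randomForest package, whose predict method reports the terminal node each observation lands in for every tree via nodes = TRUE (iris is used purely for illustration):

    library(randomForest)
    rf <- randomForest(Sepal.Length ~ ., data = iris, ntree = 50)
    new_nodes   <- attr(predict(rf, newdata = iris[1:3, ], nodes = TRUE), "nodes")
    train_nodes <- attr(predict(rf, newdata = iris, nodes = TRUE), "nodes")
    # Training rows "associated" with new observation 1 in tree 1:
    which(train_nodes[, 1] == new_nodes[1, 1])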

Covariance matrix from np.polyfit() has negative diagonal?

Question: Problem: the cov=True option of np.polyfit() produces a covariance matrix whose diagonal contains nonsensical negative values. UPDATE: after playing with this some more, I am really starting to suspect a bug in numpy. Is that possible? Deleting any pair of 13 values from the dataset fixes the problem. I am using np.polyfit() to calculate the slope and intercept coefficients of a dataset. A plot of the values produces a very (but not perfectly) linear graph. I am attempting to get the standard deviation on
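A minimal sketch of the intended usage, to separate a misread result from a genuine bug: the parameter standard deviations are the square roots of the covariance diagonal, so a negative diagonal immediately produces NaNs. The data below is illustrative; note also that some NumPy versions scale this covariance by the residuals over (N - deg - 2), which can behave surprisingly for small samples:

    import numpy as np

    rng = np.random.default_rng(0)
    x = np.arange(20, dtype=float)
    y = 2.0 * x + 1.0 + rng.normal(scale=0.5, size=x.size)
    coeffs, cov = np.polyfit(x, y, deg=1, cov=True)
    print(coeffs)                  # slope and intercept
    print(np.sqrt(np.diag(cov)))   # their standard deviations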

R: plm — year fixed effects — year and quarter data

Question: I am having a problem setting up a panel data model. Here is some sample data:

    library(plm)
    id   <- c(1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2)
    year <- c(1999,1999,1999,1999,2000,2000,2000,2000,
              1999,1999,1999,1999,2000,2000,2000,2000)
    qtr  <- c(1,2,3,4,1,2,3,4,1,2,3,4,1,2,3,4)
    y    <- rnorm(16, mean = 0, sd = 1)
    x    <- rnorm(16, mean = 0, sd = 1)
    data <- data.frame(id = id, year = year, qtr = qtr,
                       y_q = paste(year, qtr, sep = "_"), y = y, x = x)

I run the following regression using 'id' as the individual index and 'year' as the time
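A minimal sketch of one workaround: with quarterly rows, 'year' alone is not a valid time index, because each id-year pair appears four times; the year-quarter string can serve as the time index while year dummies supply the year fixed effects. This is an assumed resolution, since the excerpt cuts off before the actual model call:

    library(plm)
    pdata <- pdata.frame(data, index = c("id", "y_q"))
    fit   <- plm(y ~ x + factor(year), data = pdata, model = "within")
    summary(fit)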

How can I see multiple variables' outliers in one boxplot using R?

Question: I am a newbie to R and I have a question. To check a variable for outliers we generally use:

    boxplot(train$rate)

Suppose rate is a variable in my dataset and train is my dataset's name. But when I have many variables, say 100 or 150, checking each variable's outliers one by one is very time-consuming. Is there a function that shows the outliers of all 100 variables in one boxplot? If yes, which function can remove those variables' outliers at one time
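A minimal sketch of the screening step: boxplot() accepts a whole data frame and draws one box per column, so every numeric variable can be inspected at once (the numeric-column filter assumes the data frame mixes types):

    num_cols <- sapply(train, is.numeric)
    boxplot(train[, num_cols], las = 2, cex.axis = 0.6)  # one box per variable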