linear-regression

MM robust estimation in ggplot2 using stat_smooth with method = "rlm"

Submitted by 喜欢而已 on 2019-12-06 06:08:34
Question: The function rlm (MASS) permits both M and MM estimation for robust regression. I would like to plot the smoother from an MM robust regression in ggplot2; however, I think that when selecting method = "rlm" in stat_smooth, the M-type estimation is chosen automatically. Is there any way of selecting MM-type estimation for the rlm function through ggplot2? Here is my code: df <- data.frame("x"=c(119,118,144,127,78.8,98.4,108,50,74,30.4, 50,72,99,155,113,144,102,131,105,127 …
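A minimal sketch of one way to do this, assuming a ggplot2 version that supports the method.args parameter: extra arguments are forwarded to the fitting function, so MM estimation can be requested from MASS::rlm directly. The data frame below is hypothetical, since the question's data are truncated above.

    library(ggplot2)

    # Hypothetical data standing in for the truncated df above
    df <- data.frame(x = c(119, 118, 144, 127, 78.8, 98.4, 108, 50, 74, 30.4),
                     y = c(205, 210, 260, 230, 150, 180, 200, 95, 140, 60))

    ggplot(df, aes(x, y)) +
      geom_point() +
      # method.args is forwarded to rlm, switching it from its
      # default M estimation to MM estimation
      stat_smooth(method = MASS::rlm, method.args = list(method = "MM"),
                  se = FALSE)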

Weighted Non-negative Least Squares Linear Regression in Python [closed]

Submitted by 霸气de小男生 on 2019-12-06 05:53:36
I know there is a weighted OLS solver and a constrained OLS solver. Is there a routine that combines the two? You can simulate OLS weighting by modifying the X and y inputs. In OLS, you solve for β in XᵀXβ = Xᵀy. In weighted OLS, you solve XᵀWXβ = XᵀWy, where W is a diagonal matrix with nonnegative entries. It follows that W^0.5 exists, and you can reformulate this as (W^0.5 X)ᵀ(W^0.5 X)β = (W^0.5 X)ᵀ(W^0.5 y), i.e. an ordinary (non-negative) least squares problem on the transformed inputs W^0.5 X and W^0.5 y.
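A minimal sketch of that transformation with SciPy, using scipy.optimize.nnls as the non-negative solver and hypothetical random data:

    import numpy as np
    from scipy.optimize import nnls

    # Hypothetical problem: known nonnegative coefficients plus noise
    rng = np.random.default_rng(0)
    X = rng.random((50, 3))
    y = X @ np.array([1.5, 0.0, 2.0]) + 0.05 * rng.standard_normal(50)
    w = rng.random(50) + 0.5          # positive observation weights

    # Absorb the weights into the design: scaling each row of X and each
    # entry of y by sqrt(w_i) turns weighted NNLS into ordinary NNLS
    sw = np.sqrt(w)
    beta, rnorm = nnls(X * sw[:, None], y * sw)
    print(beta)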

negative value for "mean_squared_error"

Submitted by 时间秒杀一切 on 2019-12-06 05:50:08
I am using scikit-learn, with mean_squared_error as the scoring function for model evaluation in cross_val_score: rms_score = cross_validation.cross_val_score(model, X, y, cv=20, scoring='mean_squared_error'). I am using mean_squared_error since it is a regression problem, and the estimators (model) used are lasso, ridge and elasticNet. For all these estimators I am getting negative values for rms_score. How is that possible, given that the differences in y values are squared? You get the mean_squared_error with its sign flipped from cross_validation.cross_val_score, because scorers follow a greater-is-better convention. There is an issue …
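A minimal sketch with the current scikit-learn API, where the scorer is spelled "neg_mean_squared_error" and the sign convention is explicit; the data set and alpha are hypothetical:

    from sklearn.datasets import make_regression
    from sklearn.linear_model import Lasso
    from sklearn.model_selection import cross_val_score

    X, y = make_regression(n_samples=200, n_features=5, noise=1.0,
                           random_state=0)

    # Scorers follow a "greater is better" convention, so sklearn returns
    # the *negated* MSE; flip the sign to recover the usual positive values
    neg_mse = cross_val_score(Lasso(alpha=0.1), X, y, cv=5,
                              scoring="neg_mean_squared_error")
    mse = -neg_mse
    print(mse)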

Piecewise regression with a straight line and a horizontal line joining at a break point

Submitted by ☆樱花仙子☆ on 2019-12-06 03:09:42
Question: I want to do a piecewise linear regression with one break point, where the second half of the regression line has slope = 0. There are examples of how to do a piecewise linear regression, such as here. The problem I'm having is that I'm not clear on how to fix the slope of half of the model to be 0. I tried lhs <- function(x) ifelse(x < k, k-x, 0); rhs <- function(x) ifelse(x < k, 0, x-k); fit <- lm(y ~ lhs(x) + rhs(x)), where k is the break point, but the segment on the right is not flat / horizontal …
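A minimal sketch of one way to do this, on hypothetical data: for a flat right-hand segment the rhs basis can simply be dropped, and an unknown break point can be profiled over a grid:

    # Hypothetical data that plateau after x = 5
    set.seed(1)
    x <- seq(0, 10, length.out = 100)
    y <- ifelse(x < 5, 2 * x, 10) + rnorm(100, sd = 0.5)

    # With the right-hand slope fixed at 0, only the lhs basis is needed;
    # the intercept of this fit is the plateau level
    k <- 5
    lhs <- function(x) ifelse(x < k, k - x, 0)
    fit <- lm(y ~ lhs(x))
    coef(fit)

    # If k is unknown, profile the residual sum of squares over a grid
    ks <- seq(min(x) + 0.5, max(x) - 0.5, length.out = 50)
    rss <- sapply(ks, function(kk)
      sum(resid(lm(y ~ ifelse(x < kk, kk - x, 0)))^2))
    k_hat <- ks[which.min(rss)]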

Plotting Pandas OLS linear regression results

Submitted by 别说谁变了你拦得住时间么 on 2019-12-06 02:42:16
Question: How would I plot the results of this linear regression I did with pandas? import pandas as pd; from pandas.stats.api import ols; df = pd.read_csv('Samples.csv', index_col=0); control = ols(y=df['Control'], x=df['Day']); one = ols(y=df['Sample1'], x=df['Day']); two = ols(y=df['Sample2'], x=df['Day']). I tried plot() but it did not work. I want to plot all three samples on one plot; is there any pandas or matplotlib code to handle data in the format of these summaries? Anyways …
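pandas.stats.api.ols has since been removed from pandas; a minimal sketch of the same workflow with statsmodels and matplotlib, using a hypothetical stand-in for Samples.csv:

    import matplotlib.pyplot as plt
    import numpy as np
    import pandas as pd
    import statsmodels.api as sm

    # Hypothetical data in place of Samples.csv
    rng = np.random.default_rng(0)
    day = np.arange(1, 11)
    df = pd.DataFrame({"Day": day,
                       "Control": 1.0 * day + rng.standard_normal(10),
                       "Sample1": 1.5 * day + rng.standard_normal(10),
                       "Sample2": 0.8 * day + rng.standard_normal(10)})

    # One OLS fit per column, with data and fitted line on the same axes
    X = sm.add_constant(df["Day"])
    fig, ax = plt.subplots()
    for col in ["Control", "Sample1", "Sample2"]:
        fit = sm.OLS(df[col], X).fit()
        ax.scatter(df["Day"], df[col], label=col)
        ax.plot(df["Day"], fit.predict(X), label=col + " fit")
    ax.legend()
    plt.show()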

Fast group-by simple linear regression

Submitted by 大兔子大兔子 on 2019-12-06 02:29:39
Question: This Q & A arises from How to make group_by and lm fast?, where the OP was trying to do a simple linear regression per group on a large data frame. In theory, a series of group-by regressions y ~ x | g is equivalent to a single pooled regression y ~ x * g. The latter is very appealing because statistical tests between different groups are then straightforward. But in practice, fitting this larger regression is not computationally easy. My answer on the linked Q & A reviews the packages speedlm and glm4, but …
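A minimal sketch of the closed-form route, on hypothetical data: for simple regression the per-group slope is cov(x, y) / var(x), so grouped sums replace repeated lm() calls entirely:

    set.seed(42)
    n <- 1e6
    df <- data.frame(g = sample(letters, n, replace = TRUE), x = rnorm(n))
    df$y <- 2 * df$x + rnorm(n)

    # Per-group slope and intercept from centred cross-products,
    # avoiding one lm() call per group
    fits <- do.call(rbind, by(df, df$g, function(d) {
      xc <- d$x - mean(d$x)
      slope <- sum(xc * d$y) / sum(xc^2)
      data.frame(g = d$g[1], slope = slope,
                 intercept = mean(d$y) - slope * mean(d$x))
    }))
    head(fits)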

Multiple linear regression with Python

Submitted by 为君一笑 on 2019-12-05 23:28:04
Question: I would like to calculate a multiple linear regression with Python. I found this code for simple linear regression: import numpy as np; from matplotlib.pyplot import *; x = np.array([1, 2, 3, 4, 5]); y = np.array([2, 3, 4, 4, 5]); n = np.max(x.shape); X = np.vstack([np.ones(n), x]).T; a = np.linalg.lstsq(X, y)[0]. So a holds the coefficients, but I don't see what the [0] means. And how can I change the code to obtain a multiple linear regression? Answer 1: To implement multiple linear regression with Python you …
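A minimal sketch answering both parts: np.linalg.lstsq returns a 4-tuple (coefficients, residuals, rank, singular values), so the [0] selects the coefficient vector, and extra predictors are just extra columns in X. The second predictor here is hypothetical:

    import numpy as np

    x1 = np.array([1, 2, 3, 4, 5])
    x2 = np.array([2, 1, 4, 3, 5])   # hypothetical second predictor
    y = np.array([2, 3, 4, 4, 5])

    # Stack a column of ones (intercept) with one column per predictor
    X = np.vstack([np.ones(len(y)), x1, x2]).T

    # lstsq returns (solution, residuals, rank, singular values);
    # indexing with [0] keeps only the solution vector
    coef, residuals, rank, sv = np.linalg.lstsq(X, y, rcond=None)
    print(coef)  # intercept, slope for x1, slope for x2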

Fitting a curve with a pivot point in Python

Submitted by 不打扰是莪最后的温柔 on 2019-12-05 21:03:20
I have the plot below and I want to fit it with two lines. Using Python I managed to fit the upper part: def func(x, a, b): x = np.array(x); return a*(x**b); popt, pcov = curve_fit(func, up_x, up_y). I want to fit the lower part with another line, but I want that line to pass through the point where the red one starts, so that the overall function is continuous. So my question is: how can I use curve_fit while forcing the function to pass through a given point, but leaving the slope of the line to be estimated (or any other Python package able to do it)? A possible stepwise parametrisation of your model in …
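A minimal sketch of the usual trick, assuming a hypothetical pivot (x0, y0) taken from where the upper fit ends: bake the pivot into the model function so only the slope remains free for curve_fit to estimate:

    import numpy as np
    from scipy.optimize import curve_fit

    # Hypothetical pivot: the point where the upper power-law fit begins
    x0, y0 = 2.0, 3.5

    # Only the slope m is a free parameter; the line is forced
    # through (x0, y0) by construction
    def line_through_pivot(x, m):
        return y0 + m * (np.asarray(x) - x0)

    # Hypothetical lower-branch data
    low_x = np.array([0.5, 1.0, 1.5, 2.0])
    low_y = np.array([1.2, 2.0, 2.9, 3.5])

    popt, pcov = curve_fit(line_through_pivot, low_x, low_y)
    print(popt[0])  # fitted slope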

Python Parallel Computing - Scoop

Submitted by 假装没事ソ on 2019-12-05 18:39:35
I am trying to get familiar with the library Scoop (documentation here: https://media.readthedocs.org/pdf/scoop/0.7/scoop.pdf) to learn how to perform statistical computations in parallel, using in particular the futures.map function. As a first step, I would like to run a simple linear regression and assess the difference in performance between serial and parallel computation, using 10,000,000 data points (4 features, 1 target variable) randomly generated from a normal distribution. This is my code: import pandas as pd; import numpy as np; import random; from scoop import futures; import …
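A minimal sketch of futures.map on a toy version of that task, with hypothetical chunked data; note that SCOOP scripts must be launched with python -m scoop to actually run in parallel:

    import numpy as np
    from scoop import futures

    def fit_chunk(seed):
        # Each worker generates and fits its own hypothetical chunk
        rng = np.random.default_rng(seed)
        X = rng.standard_normal((100000, 4))
        y = X @ np.array([1.0, -2.0, 0.5, 3.0]) + rng.standard_normal(100000)
        Xi = np.column_stack([np.ones(len(y)), X])
        beta, *_ = np.linalg.lstsq(Xi, y, rcond=None)
        return beta

    if __name__ == "__main__":
        # futures.map distributes the calls across SCOOP workers
        results = list(futures.map(fit_chunk, range(8)))
        print(np.mean(results, axis=0))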

Seaborn: annotate the linear regression equation

Submitted by 折月煮酒 on 2019-12-05 18:03:46
Question: I tried fitting an OLS model to the Boston data set. My graph looks like the one below. How do I annotate the graph with the linear regression equation, just above the line or somewhere else in the plot? And how do I print the equation in Python? I am fairly new to this area and exploring Python as of now; if somebody can help me, it would speed up my learning curve. Many thanks! I tried this as well. My problem is: how do I annotate the graph with the above in equation format? Answer 1: You can use the coefficients of a linear fit to make a legend …
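Answer 1 is truncated above; a minimal sketch of the idea it points to, using hypothetical data in place of the Boston set:

    import matplotlib.pyplot as plt
    import numpy as np
    import seaborn as sns

    # Hypothetical data standing in for the Boston example
    rng = np.random.default_rng(0)
    x = rng.uniform(0, 10, 50)
    y = 2.3 * x + 1.1 + rng.standard_normal(50)

    ax = sns.regplot(x=x, y=y)

    # Fit once more with polyfit just to get slope/intercept for the label
    slope, intercept = np.polyfit(x, y, 1)
    ax.annotate(f"y = {slope:.2f}x + {intercept:.2f}",
                xy=(0.05, 0.9), xycoords="axes fraction")
    plt.show()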