How to plot statsmodels linear regression (OLS) cleanly

Submitted by Anonymous (unverified) on 2019-12-03 02:59:02

Question:

Problem Statement:

I have some nice data in a pandas dataframe. I'd like to run simple linear regression on it:

Using statsmodels, I perform my regression. Now, how do I get my plot? I've tried statsmodels' plot_fit method, but the plot is a little funky:

I was hoping to get a horizontal line which represents the actual result of the regression.

Statsmodels has a variety of methods for plotting regression (a few more details about them here) but none of them seem to be the super simple "just plot the regression line on top of your data" -- plot_fit seems to be the closest thing.

Questions:

  • The first picture above is from pandas' plot function, which returns a matplotlib.axes._subplots.AxesSubplot. Can I overlay a regression line easily onto that plot?
  • Is there a function in statsmodels I've overlooked?
  • Is there a better way to put together this figure?

Two related questions:

Neither seems to have a good answer.

Sample data

As requested by @IgorRaush

           motifScore  expression
    6870     1.401123        0.55
    10456    1.188554       -1.58
    12455    1.476361       -1.75
    18052    1.805736        0.13
    19725    1.110953        2.30
    30401    1.744645       -0.49
    30716    1.098253       -1.59
    30771    1.098253       -2.04
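For anyone who wants to reproduce this, the rows above can be typed into a DataFrame as below. This is only a minimal sketch: the name `motif` is chosen to match the variable used in the answer's code, and the index labels are copied from the printed output.

```python
import pandas as pd

# Sample rows copied from the question; the index labels are the
# original row labels shown in the printed DataFrame.
motif = pd.DataFrame(
    {
        "motifScore": [1.401123, 1.188554, 1.476361, 1.805736,
                       1.110953, 1.744645, 1.098253, 1.098253],
        "expression": [0.55, -1.58, -1.75, 0.13,
                       2.30, -0.49, -1.59, -2.04],
    },
    index=[6870, 10456, 12455, 18052, 19725, 30401, 30716, 30771],
)

print(motif)
```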

abline_plot

I had tried this, but it doesn't seem to work... not sure why:

Answer 1:

As I mentioned in the comments, seaborn is a great choice for statistical data visualization.

    import seaborn as sns

    sns.regplot(x='motifScore', y='expression', data=motif)


Alternatively, you can use statsmodels.regression.linear_model.OLS and manually plot a regression line.

    import numpy as np
    import statsmodels.api as sm

    # regress "expression" onto "motifScore" (plus an intercept)
    model = sm.OLS(motif.expression, sm.add_constant(motif.motifScore))
    p = model.fit().params

    # generate x-values for your regression line (two is sufficient)
    x = np.arange(1, 3)

    # scatter-plot data
    ax = motif.plot(x='motifScore', y='expression', kind='scatter')

    # plot regression line on the same axes, set x-axis limits
    ax.plot(x, p.const + p.motifScore * x)
    ax.set_xlim([1, 2])


Yet another solution is statsmodels.graphics.regressionplots.abline_plot which takes away some of the boilerplate from the above approach.

    import statsmodels.api as sm
    from statsmodels.graphics.regressionplots import abline_plot

    # regress "expression" onto "motifScore" (plus an intercept)
    model = sm.OLS(motif.expression, sm.add_constant(motif.motifScore))

    # scatter-plot data
    ax = motif.plot(x='motifScore', y='expression', kind='scatter')

    # plot regression line
    abline_plot(model_results=model.fit(), ax=ax)


