ECDF in Python without a step function?

Posted by 那年仲夏 on 2019-12-10 10:47:06

Question


I have been using ECDF (empirical cumulative distribution function) from statsmodels.distributions to plot a CDF of some data. However, ECDF uses a step function and as a consequence I get jagged-looking plots.

So my question is: do scipy or statsmodels have a built-in ECDF that does not use a step function?

By the way, I know I can do this:

# normed= has been removed from NumPy; density= is its replacement
hist, bin_edges = np.histogram(b_oz, density=True)
plt.plot(np.cumsum(hist))

but I don't get the right scales.
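
The scale mismatch in this snippet has two likely causes. With a density-normalized histogram the bar heights integrate to 1 rather than sum to 1, so the cumulative sum has to be weighted by the bin widths. In addition, plot needs the bin edges as x values instead of the default index. A minimal sketch, assuming NumPy only; the b_oz array here is made-up sample data standing in for the real one, and histogram_cdf is a hypothetical helper name:

```python
import numpy as np

def histogram_cdf(data, bins=10):
    """Return (x, cdf) for a binned CDF estimate on the data's own scale."""
    # density=True makes the bar heights integrate to 1 over the data range.
    hist, bin_edges = np.histogram(data, bins=bins, density=True)
    # Weight each height by its bin width so the running total ends at 1.
    cdf = np.cumsum(hist * np.diff(bin_edges))
    # The cumulative value at index i is reached at the right edge of bin i.
    return bin_edges[1:], cdf

# Made-up stand-in for the b_oz array from the question.
rng = np.random.default_rng(0)
b_oz = rng.normal(loc=12.0, scale=2.0, size=500)
x, cdf = histogram_cdf(b_oz, bins=20)
```

Passing these to plt.plot(x, cdf) puts the data values on the x axis and probabilities from 0 to 1 on the y axis. The curve is still a binned approximation, so it gets coarser as the number of bins drops.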

Thanks!


Answer 1:


If you just want to change the plot, then you could let matplotlib interpolate between the observed values.

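>>> import numpy as np
>>> import matplotlib.pyplot as plt
>>> import statsmodels.api as sm
>>> nobs = 100  # assumed sample size (not given in the original)
>>> xx = np.random.randn(nobs)
>>> ecdf = sm.distributions.ECDF(xx)
>>> plt.plot(ecdf.x, ecdf.y)
[<matplotlib.lines.Line2D object at 0x07A872D0>]
>>> plt.show()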

Or sort the original data and plot the ECDF evaluated at the sorted values:

>>> xx.sort()
>>> plt.plot(xx, ecdf(xx))
[<matplotlib.lines.Line2D object at 0x07A87090>]
>>> plt.show()

This is the same as computing the cumulative proportions and plotting them directly:

>>> a=0; plt.plot(xx, np.arange(1.,nobs+1)/(nobs+a))
[<matplotlib.lines.Line2D object at 0x07A87D30>]
>>> plt.show()

Note: depending on how you want the ECDF to behave at the boundaries and how it should be centered, several different normalizations for "plotting positions" are in common use. The parameter a that I added above is an example of this; a=1 is a common choice.
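
To make the effect of a concrete, here is a small sketch of the same formula, i / (n + a), wrapped in a function. The name ecdf_points is hypothetical and is not part of statsmodels or scipy. With a=0 the largest observation is assigned probability 1; with a=1 it is assigned n / (n + 1), so the curve stays strictly below 1.

```python
import numpy as np

def ecdf_points(data, a=0):
    """Sorted data and plotting positions i / (n + a), for i = 1..n."""
    x = np.sort(np.asarray(data, dtype=float))
    n = x.size
    y = np.arange(1.0, n + 1) / (n + a)
    return x, y

x0, y0 = ecdf_points([3.0, 1.0, 2.0], a=0)  # y0 = 1/3, 2/3, 1
x1, y1 = ecdf_points([3.0, 1.0, 2.0], a=1)  # y1 = 0.25, 0.5, 0.75
```

Either pair can be passed straight to plt.plot, which draws straight segments between the points instead of steps.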

As an alternative to the empirical CDF, you could also use an interpolated or smoothed ECDF or histogram, or a kernel density estimate.
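
One way to get a smooth curve is to integrate a Gaussian kernel density estimate, which amounts to averaging a normal CDF centered at each observation. The sketch below is only an illustration under stated assumptions: smoothed_cdf is a hypothetical helper, not a statsmodels or scipy function, and the default bandwidth uses the normal reference rule of thumb 1.06 * std * n^(-1/5), which may oversmooth data that is far from normal.

```python
import math
import numpy as np

# Standard normal CDF, applied elementwise to arrays via math.erf.
_norm_cdf = np.vectorize(lambda z: 0.5 * (1.0 + math.erf(z / math.sqrt(2.0))))

def smoothed_cdf(data, grid, bandwidth=None):
    """Kernel-smoothed CDF estimate evaluated at the points in grid."""
    data = np.asarray(data, dtype=float)
    grid = np.asarray(grid, dtype=float)
    if bandwidth is None:
        # Assumed default: normal reference rule of thumb.
        bandwidth = 1.06 * data.std(ddof=1) * data.size ** (-0.2)
    # Rows index grid points, columns index observations.
    z = (grid[:, None] - data[None, :]) / bandwidth
    return _norm_cdf(z).mean(axis=1)

rng = np.random.default_rng(0)
xx = rng.standard_normal(200)
grid = np.linspace(xx.min() - 1.0, xx.max() + 1.0, 300)
cdf = smoothed_cdf(xx, grid)
```

Plotting with plt.plot(grid, cdf) gives a curve without steps. A smaller bandwidth tracks the ECDF more closely, while a larger one smooths more.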



Source: https://stackoverflow.com/questions/14006520/ecdf-in-python-without-step-function
