Quantile-Quantile Plot using SciPy

Anonymous (unverified), submitted on 2019-12-03 02:13:02

Question:

How would you create a qq-plot using Python?

Assume that you have a large set of measurements and are using some plotting function that takes XY values as input. The function should plot the quantiles of the measurements against the corresponding quantiles of some distribution (normal, uniform, ...).

The resulting plot then lets us evaluate whether our measurements follow the assumed distribution or not.

http://en.wikipedia.org/wiki/Quantile-quantile_plot

Both R and Matlab provide ready-made functions for this, but I am wondering what the cleanest method for implementing it in Python would be.

Answer 1:

I think that scipy.stats.probplot will do what you want. See the documentation for more detail.

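    import numpy as np
    import matplotlib.pyplot as plt
    import scipy.stats as stats

    # 100 draws from a normal distribution with mean 20 and standard deviation 5
    measurements = np.random.normal(loc=20, scale=5, size=100)

    # Plot the ordered measurements against normal quantiles, with a fitted line
    stats.probplot(measurements, dist="norm", plot=plt)
    plt.show()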
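
Since the question asks for XY values to pass to an arbitrary plotting function, it may help to note that probplot also works without the plot argument and simply returns the numbers. The following is a minimal sketch of that usage; the seed, the variable names and the xy array are illustrative assumptions, not part of the original answer.

```python
import numpy as np
import scipy.stats as stats

# Seeded generator so the example is reproducible (the seed is arbitrary)
rng = np.random.default_rng(0)
measurements = rng.normal(loc=20, scale=5, size=100)

# Without plot=..., probplot only computes the values:
# (theoretical quantiles, ordered sample values), (slope, intercept, r)
(osm, osr), (slope, intercept, r) = stats.probplot(measurements, dist="norm")

# osm and osr are the XY pairs to hand to any plotting function
xy = np.column_stack([osm, osr])

# For normal data the fitted line roughly estimates scale (slope) and loc (intercept)
print(f"slope={slope:.2f}, intercept={intercept:.2f}, r={r:.4f}")
```

A value of r close to 1 suggests the points lie near a straight line, which is what one would expect if the data follow the assumed distribution.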


Answer 2:

Using qqplot of statsmodels.api is another option:

Very basic example:

    import numpy as np
    import statsmodels.api as sm
    import matplotlib.pyplot as plt

    test = np.random.normal(0, 1, 1000)

    # line='45' draws the 45-degree reference line
    sm.qqplot(test, line='45')
    plt.show()

Documentation and more examples are available in the statsmodels documentation for qqplot.



Answer 3:

If you need to do a QQ plot of one sample vs. another, statsmodels includes qqplot_2samples(). Like Ricky Robinson in a comment above, I think of this as a QQ plot proper, as opposed to a probability plot, which compares a sample against a theoretical distribution.

http://statsmodels.sourceforge.net/devel/generated/statsmodels.graphics.gofplots.qqplot_2samples.html
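
To illustrate what a two-sample QQ plot compares, here is a minimal sketch using only NumPy. It is not the statsmodels implementation; the helper name qq_two_samples and the choice of evaluating both samples on a common grid of probabilities are assumptions made for illustration.

```python
import numpy as np

def qq_two_samples(x, y, n_points=None):
    """Return matched quantiles of two samples (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if n_points is None:
        # Use as many points as the smaller sample has observations
        n_points = min(len(x), len(y))
    # Probabilities (i - 0.5) / n lie strictly inside (0, 1)
    probs = (np.arange(1, n_points + 1) - 0.5) / n_points
    return np.quantile(x, probs), np.quantile(y, probs)

rng = np.random.default_rng(1)
a = rng.normal(0, 1, size=500)
b = rng.normal(0, 1, size=800)

# qa and qb are the X and Y values of the two-sample QQ plot
qa, qb = qq_two_samples(a, b)
```

If both samples come from the same distribution, the points (qa, qb) should fall close to the line y = x, even when the sample sizes differ.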



Answer 4:

It exists now in the statsmodels package:

http://statsmodels.sourceforge.net/devel/generated/statsmodels.graphics.gofplots.qqplot.html



Answer 5:

I came up with this. Maybe you can improve it. In particular, the method of generating the quantiles of the distribution seems cumbersome to me.

You could replace np.random.normal with any other distribution from np.random to compare data against other distributions.

    #!/usr/bin/env python
    import numpy as np

    measurements = np.random.normal(loc=20, scale=5, size=100000)

    def qq_plot(data, sample_size):
        # Column 0: sorted subsample of the data; column 1: sorted normal draws
        qq = np.ones([sample_size, 2])
        np.random.shuffle(data)  # shuffles in place, so the slice below is a random subsample
        qq[:, 0] = np.sort(data[0:sample_size])
        qq[:, 1] = np.sort(np.random.normal(size=sample_size))
        return qq

    print(qq_plot(measurements, 1000))
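
One possible way to avoid simulating the reference distribution is to compute its quantiles directly with the ppf (inverse CDF) method of a scipy.stats distribution. The sketch below is an alternative to the code above, not the answer author's method; the helper name qq_values and the (i - 0.5) / n probabilities are assumptions, and other conventions for these probabilities exist.

```python
import numpy as np
from scipy import stats

def qq_values(data, dist=stats.norm):
    """Return (theoretical, sample) quantile pairs (hypothetical helper)."""
    sample = np.sort(np.asarray(data, dtype=float))
    n = sample.size
    # Probabilities (i - 0.5) / n, one common convention among several
    probs = (np.arange(1, n + 1) - 0.5) / n
    # Exact quantiles of the reference distribution, no random draws needed
    theoretical = dist.ppf(probs)
    return np.column_stack([theoretical, sample])

rng = np.random.default_rng(2)
measurements = rng.normal(loc=20, scale=5, size=1000)

# Column 0 holds the X values, column 1 the Y values
qq = qq_values(measurements)
```

To compare against another distribution, pass a different scipy.stats object, for example stats.uniform or a frozen distribution such as stats.t(df=5). Unlike the sampling approach, this gives the same theoretical quantiles on every run and does not modify the input array.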


Answer 6:

You can use bokeh

    from bokeh.plotting import figure, show
    from scipy.stats import probplot

    # pd_series is the series you want to plot
    series1 = probplot(pd_series, dist="norm")
    p1 = figure(title="Normal QQ-Plot", background_fill_color="#E8DDCB")
    p1.scatter(series1[0][0], series1[0][1], fill_color="red")
    show(p1)

