Question:
How would you create a qq-plot using Python?
Assume that you have a large set of measurements and are using some plotting function that takes XY values as input. The function should plot the quantiles of the measurements against the corresponding quantiles of some distribution (normal, uniform, ...).
The resulting plot then lets us evaluate whether our measurements follow the assumed distribution or not.
http://en.wikipedia.org/wiki/Quantile-quantile_plot
Both R and Matlab provide ready-made functions for this, but I am wondering what the cleanest way of implementing it in Python would be.
Answer 1:
I think that scipy.stats.probplot will do what you want. See the documentation for more detail.
    import numpy as np
    import pylab
    import scipy.stats as stats

    measurements = np.random.normal(loc=20, scale=5, size=100)
    stats.probplot(measurements, dist="norm", plot=pylab)
    pylab.show()
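If your plotting function only takes XY values, as the question describes, probplot can also be called without the plot argument, in which case it only returns the numbers. A minimal sketch (the variable names and the fixed seed are my own choices for illustration, not part of the answer above):

```python
import numpy as np
import scipy.stats as stats

rng = np.random.default_rng(0)  # fixed seed, only so the example is repeatable
measurements = rng.normal(loc=20, scale=5, size=100)

# Without plot=..., probplot only computes the values:
#   osm: theoretical quantiles of the chosen distribution (x values)
#   osr: the sorted measurements (y values)
#   slope, intercept, r: least-squares line through the points and its correlation
(osm, osr), (slope, intercept, r) = stats.probplot(measurements, dist="norm")

xy = np.column_stack([osm, osr])  # XY pairs for any plotting function
print(xy[:5])
print("slope=%.2f intercept=%.2f r=%.3f" % (slope, intercept, r))
```

For normally distributed data the slope and intercept roughly estimate the standard deviation and the mean, and an r close to 1 suggests the points lie near a straight line.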
Answer 2:
Using qqplot from statsmodels.api is another option.
Very basic example:
    import numpy as np
    import statsmodels.api as sm
    import pylab

    test = np.random.normal(0, 1, 1000)
    sm.qqplot(test, line='45')
    pylab.show()
Documentation and more examples are available in the statsmodels documentation for qqplot.
Answer 3:
If you need to do a QQ plot of one sample vs. another, statsmodels includes qqplot_2samples(). Like Ricky Robinson in a comment above, I think of this as a QQ plot, as opposed to a probability plot, which compares a sample against a theoretical distribution.
http://statsmodels.sourceforge.net/devel/generated/statsmodels.graphics.gofplots.qqplot_2samples.html
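To make the sample-vs-sample idea concrete, here is a rough numpy-only sketch of what a two-sample QQ plot computes. This is not how statsmodels implements qqplot_2samples(); qq_two_samples is a hypothetical helper of mine, and the choice of 100 quantiles is arbitrary:

```python
import numpy as np

def qq_two_samples(a, b, n_quantiles=100):
    """Return matching quantiles of two samples (sample sizes may differ)."""
    # Evaluate both samples at the same probabilities, avoiding the extremes 0 and 1
    probs = (np.arange(n_quantiles) + 0.5) / n_quantiles
    return np.quantile(a, probs), np.quantile(b, probs)

rng = np.random.default_rng(0)  # fixed seed, only for repeatability
x = rng.normal(loc=0, scale=1, size=500)
y = rng.normal(loc=0, scale=1, size=800)

qx, qy = qq_two_samples(x, y)
# Plot qx against qy; points near the line y = x suggest similar distributions
print(np.column_stack([qx, qy])[:5])
```

Because both samples are reduced to the same set of probabilities, they do not need to have the same length.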
Answer 4:
Answer 5:
I came up with this. Maybe you can improve it. In particular, the way it generates the quantiles of the distribution seems cumbersome to me.
You could replace np.random.normal with any other distribution from np.random to compare the data against other distributions.
    #!/usr/bin/env python
    import numpy as np

    measurements = np.random.normal(loc=20, scale=5, size=100000)

    def qq_plot(data, sample_size):
        # Column 0: sorted random subsample of the data
        # Column 1: sorted random draws from the reference distribution
        qq = np.ones([sample_size, 2])
        np.random.shuffle(data)  # note: shuffles the caller's array in place
        qq[:, 0] = np.sort(data[0:sample_size])
        qq[:, 1] = np.sort(np.random.normal(size=sample_size))
        return qq

    print(qq_plot(measurements, 1000))
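One possible way to make the quantile generation less cumbersome is to compute the theoretical quantiles directly from the inverse CDF instead of sorting random draws. The sketch below is a variation on the code above, not the original author's method; qq_points is a hypothetical name, and the plotting positions (i - 0.5) / n are just one common convention among several:

```python
import numpy as np
from scipy.stats import norm

def qq_points(data, dist=norm):
    """Return an (n, 2) array: sorted data vs. theoretical quantiles of dist."""
    data = np.sort(np.asarray(data, dtype=float))  # sorted copy; caller's array is untouched
    n = len(data)
    # Plotting positions (i - 0.5) / n for i = 1..n (an assumed convention)
    probs = (np.arange(1, n + 1) - 0.5) / n
    theoretical = dist.ppf(probs)  # inverse CDF gives the quantiles without sampling noise
    return np.column_stack([data, theoretical])

rng = np.random.default_rng(0)  # fixed seed, only for repeatability
measurements = rng.normal(loc=20, scale=5, size=1000)
qq = qq_points(measurements)
print(qq[:5])
```

Passing another scipy.stats distribution as dist (for example scipy.stats.uniform) compares the data against that distribution instead, since every continuous distribution there provides ppf.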
Answer 6:
You can use bokeh:
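    import numpy as np
    import pandas as pd
    from bokeh.plotting import figure, show
    from scipy.stats import probplot

    # pd_series is the series you want to plot; random data stands in here as a placeholder
    pd_series = pd.Series(np.random.normal(size=500))
    # probplot returns ((theoretical quantiles, ordered values), fit parameters)
    series1 = probplot(pd_series, dist="norm")
    p1 = figure(title="Normal QQ-Plot", background_fill_color="#E8DDCB")
    p1.scatter(series1[0][0], series1[0][1], fill_color="red")
    show(p1)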