Matplotlib: How to convert a histogram to a discrete probability mass function?

人盡茶涼 提交于 2019-12-04 13:10:31

The reason is norm=True gives the probability density function. In probability theory, a probability density function or density of a continuous random variable, describes the relative likelihood for this random variable to take on a given value.

Let us consider a very simple example.

x=np.arange(0.1,1.1,0.1)
array([ 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1. ])

# Bin size
bins = np.arange(0.05, 1.15, 0.1)
np.histogram(x,bins=bins,normed=1)[0]
[ 1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.]
np.histogram(x,bins=bins,normed=0)[0]/float(len(x))
[ 0.1,  0.1,  0.1,  0.1,  0.1,  0.1,  0.1,  0.1,  0.1,  0.1]

# Change the bin size
bins = np.arange(0.05, 1.15, 0.2)
np.histogram(x,bins=bins,normed=1)[0]
[ 1.,  1.,  1.,  1.,  1.]
np.histogram(x,bins=bins,normed=0)[0]/float(len(x))
[ 0.2,  0.2,  0.2,  0.2,  0.2]

As, you can see in the above, the probability that x will lie between [0.05-0.15] or [0.15-0.25] is 1/10 whereas if you change the bin size to 0.2 then the probability that it will lie between [0.05-0.25] or [0.25-0.45] is 1/5. Now these actual probability values are dependent on the bin-size, however, the probability density is independent of the bins size. Thus, this is the only proper way to do the above, otherwise one would need to state the bin-width in each of the plot.

So in your case if you really want to plot the probability value at each bin (and not the probability density) then you can simply divide the frequency of each histogram by the number of total elements. However, I would suggest you not to do this unless you are working with discrete variables and each of your bins represent a single possible value of this variable.

Plotting a Continuous Probability Function(PDF) from a Histogram – Solved in Python. refer this blog for detailed explanation. (http://howdoudoittheeasiestway.blogspot.com/2017/09/plotting-continuous-probability.html) Else you can use the code below.

n, bins, patches = plt.hist(A, 40, histtype='bar')
plt.show()
n = n/len(A)
n = np.append(n, 0)
mu = np.mean(n)
sigma = np.std(n)
plt.bar(bins,n, width=(bins[len(bins)-1]-bins[0])/40)
y1= (1/(sigma*np.sqrt(2*np.pi))*np.exp(-(bins - mu)**2 /(2*sigma**2)))*0.03
plt.plot(bins, y1, 'r--', linewidth=2)
plt.show()
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!