Question
When I draw a distplot for a discrete variable, the distribution does not look the way I expect. For example, there are gaps between the bars of the histogram, so the curve drawn by kdeplot sits "lower" on the y axis than the bars.
In my own work, it was even worse.
I think this may be because the "width" or "weight" of each bar is not 1, but I could not find any parameter to adjust it.
I'd like to draw such a curve (it should be smoother).
Answer 1:
If the problem is that there are some empty bins in the histogram, it probably makes sense to specify the bins to match the data. In this case, use bins=np.arange(0,16) to get one bin for each integer in the data.
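import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

np.random.seed(1)
# 10,000 integers drawn uniformly from 0..14
n = np.random.randint(0, 15, 10000)
# Bin edges 0, 1, ..., 15 give one unit-width bin per integer value
# (distplot is deprecated since seaborn 0.11 in favor of histplot/displot)
sns.distplot(n, bins=np.arange(0, 16), hist_kws=dict(ec="k"))
plt.show()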
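To see where the empty bins come from, here is a minimal sketch that uses np.histogram directly instead of seaborn. The count of 20 equal-width bins is an assumption chosen for illustration; seaborn's automatic binning may pick a different number, but the mechanism is the same whenever the bins are narrower than 1.

```python
import numpy as np

rng = np.random.RandomState(1)
n = rng.randint(0, 15, 10000)  # integers 0..14

# 20 equal-width bins over [0, 14]: each bin is 0.7 wide, so no bin can hold
# two integers and some bins hold none at all.
counts_auto, edges_auto = np.histogram(n, bins=20)
empty_auto = int(np.sum(counts_auto == 0))

# Integer-aligned edges 0, 1, ..., 15: exactly one bin per value.
counts_int, edges_int = np.histogram(n, bins=np.arange(0, 16))
empty_int = int(np.sum(counts_int == 0))

print("empty bins with 20 equal-width bins:", empty_auto)
print("empty bins with integer-aligned edges:", empty_int)
```

Fifteen distinct values spread over 20 narrow bins leave 5 bins empty, which shows up as gaps in the plot; the integer-aligned edges leave none.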
Answer 2:
One way to deal with this problem might be to adjust the "bandwidth" of the KDE (see the documentation for seaborn.kdeplot()):
import numpy as np
import seaborn as sns
n = np.round(np.random.normal(5, 2, size=(10000,)))
sns.distplot(n, kde_kws={'bw': 1})
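The effect of the bandwidth can be sketched without seaborn, using a hand-written Gaussian KDE in NumPy. The helper gaussian_kde_at and the bandwidth values 0.1 and 1.0 are assumptions for illustration, not seaborn's internals; here the bandwidth is taken to be the kernel's standard deviation in data units.

```python
import numpy as np

def gaussian_kde_at(points, data, bandwidth):
    """Evaluate a Gaussian KDE of `data` at `points` (hypothetical helper)."""
    points = np.asarray(points, dtype=float)
    # Weight each distinct value by its relative frequency
    values, counts = np.unique(data, return_counts=True)
    weights = counts / counts.sum()
    # One Gaussian kernel per distinct value, evaluated at every point
    z = (points[:, None] - values[None, :]) / bandwidth
    kernels = np.exp(-0.5 * z**2) / (bandwidth * np.sqrt(2 * np.pi))
    return kernels @ weights

rng = np.random.RandomState(0)
n = np.round(rng.normal(5, 2, size=10000))  # discrete, integer-valued data

for bw in (0.1, 1.0):
    at_value, between = gaussian_kde_at([5.0, 5.5], n, bw)
    print(f"bw={bw}: density at 5.0 = {at_value:.3f}, at 5.5 = {between:.3f}")
```

With the small bandwidth the estimate drops to almost zero halfway between two integers, giving a spiky curve; with a bandwidth of 1 the densities at 5.0 and 5.5 are nearly equal, so the curve is smooth.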
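EDIT: Here is an alternative that uses separate y scales for the bars and the KDE:
import matplotlib.pyplot as plt
n = np.round(np.random.normal(5, 2, size=(10000,)))
fig, ax1 = plt.subplots()
ax2 = ax1.twinx()  # second y axis sharing the same x axis
sns.distplot(n, kde=False, ax=ax1)  # histogram (counts) on the left axis
sns.distplot(n, hist=False, ax=ax2, kde_kws={'bw': 1})  # KDE on the right axis
plt.show()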
Source: https://stackoverflow.com/questions/48990594/how-to-draw-distribution-plot-for-discrete-variables-in-seaborn