How to draw distribution plot for discrete variables in seaborn

Submitted by 。_饼干妹妹 on 2020-05-13 07:04:08


When I draw a distplot for discrete variables, the distribution does not look the way I expect. For example:

There are gaps between the bars of the histogram, so the KDE curve sits lower on the y-axis than the bars do.

In my work, it was even worse:

I think it may be because the "width" or "weight" of each bar is not 1, but I couldn't find any parameter to adjust it.

I'd like to draw a curve like this (it should be smoother):


If the problem is that there are some empty bins in the histogram, it probably makes sense to specify the bins to match the data. In this case, use bins=np.arange(0,16) so that every integer in the data (0 to 14) gets its own unit-width bin.

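import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

np.random.seed(1)
# 10,000 integers drawn uniformly from 0..14 (the upper bound 15 is exclusive)
n = np.random.randint(0, 15, 10000)
# Edges 0, 1, ..., 15 give one unit-width bin per integer; ec="k" outlines the bars
sns.distplot(n, bins=np.arange(0, 16), hist_kws=dict(ec="k"))
plt.show()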
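
To see why automatically chosen bins can leave gaps with integer data, here is a minimal sketch in plain Python (no seaborn needed). The choice of 20 equal-width bins is an assumption made for illustration, not the rule seaborn actually uses, and bin_counts is a hypothetical helper rather than a library function.

```python
import bisect
import random

random.seed(1)
data = [random.randrange(0, 15) for _ in range(10_000)]  # integers 0..14

def bin_counts(values, edges):
    """Histogram counts for half-open bins [lo, hi); the last bin also includes its right edge."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        i = bisect.bisect_right(edges, v) - 1
        i = min(i, len(counts) - 1)  # a value equal to edges[-1] goes into the last bin
        counts[i] += 1
    return counts

# Assumed automatic choice (illustrative only): 20 equal-width bins over the data range.
lo, hi, n_bins = min(data), max(data), 20
auto_edges = [lo + (hi - lo) * i / n_bins for i in range(n_bins + 1)]

# Bins matched to the data: one unit-width bin per integer, like np.arange(0, 16).
unit_edges = list(range(0, 16))

auto_counts = bin_counts(data, auto_edges)
unit_counts = bin_counts(data, unit_edges)

empty_auto = sum(c == 0 for c in auto_counts)
empty_unit = sum(c == 0 for c in unit_counts)
print("empty bins with 20 equal-width bins:", empty_auto)
print("empty bins with unit-width bins:    ", empty_unit)
```

With bins 0.7 wide, some bins cannot contain any integer, which is what shows up as gaps in the plot. Unit-width bins aligned to the integers avoid this.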


One way to deal with this problem might be to adjust the "bandwidth" of the KDE (see the documentation for seaborn.kdeplot()). A larger bandwidth smooths over the gaps between the integer values.

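import numpy as np
import seaborn as sns
n = np.round(np.random.normal(5, 2, size=10000))  # normal samples rounded to whole numbers
sns.distplot(n, kde_kws={'bw': 1})  # bw sets the KDE bandwidth (newer seaborn releases use bw_method / bw_adjust)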
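
The effect of the bandwidth can be checked by hand with a small Gaussian KDE written in plain Python. This is only a sketch: gaussian_kde here is a hypothetical helper, and its bandwidth is the kernel standard deviation in data units, which may not match how a given seaborn version interprets bw.

```python
import math
import random
from collections import Counter

random.seed(1)
# Rounded normal samples, mirroring np.round(np.random.normal(5, 2, ...)).
data = [round(random.gauss(5, 2)) for _ in range(10_000)]
value_counts = Counter(data)

def gaussian_kde(x, bandwidth):
    """Gaussian KDE at x; bandwidth is the kernel standard deviation."""
    norm = len(data) * bandwidth * math.sqrt(2 * math.pi)
    total = sum(c * math.exp(-0.5 * ((x - v) / bandwidth) ** 2)
                for v, c in value_counts.items())
    return total / norm

# Compare the density on an integer (5.0) with the density halfway between two integers (5.5).
for bw in (0.2, 1.0):
    at_integer = gaussian_kde(5.0, bw)
    between = gaussian_kde(5.5, bw)
    print(f"bw={bw}: f(5.0)={at_integer:.3f}, f(5.5)={between:.3f}, "
          f"ratio={between / at_integer:.2f}")
```

A narrow bandwidth gives a curve that dips sharply between the integers, while a bandwidth close to the spacing of the data (1 here) gives a nearly smooth curve.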

EDIT: Here is an alternative that uses separate y-axis scales for the bars and the KDE:

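import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

n = np.round(np.random.normal(5, 2, size=10000))
fig, ax1 = plt.subplots()
ax2 = ax1.twinx()  # second y-axis that shares the same x-axis

# Histogram (counts) on the left axis, KDE (density) on the right axis
sns.distplot(n, kde=False, ax=ax1)
sns.distplot(n, hist=False, ax=ax2, kde_kws={'bw': 1})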
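
The two axes carry different units: counts for the bars and density for the KDE. As a rough sketch of how they relate, a density-scaled bar has height count / (N * bin_width), so a bin narrower than 1 makes the bar taller than the share of the data it holds. That may be why the bars can tower over the KDE curve when both share one axis. The helper density_height below is hypothetical and written only for this illustration.

```python
import random
from collections import Counter

random.seed(1)
data = [round(random.gauss(5, 2)) for _ in range(10_000)]
counts = Counter(data)
n_samples = len(data)

def density_height(count, n, bin_width):
    """Height of a histogram bar on the density scale."""
    return count / (n * bin_width)

mode = counts.most_common(1)[0][0]   # most frequent integer value
share = counts[mode] / n_samples     # fraction of the data equal to the mode

h_unit = density_height(counts[mode], n_samples, 1.0)    # unit-width bin
h_narrow = density_height(counts[mode], n_samples, 0.5)  # bin half as wide

print(f"share of data at {mode}: {share:.3f}")
print(f"bar height with width 1.0: {h_unit:.3f}")
print(f"bar height with width 0.5: {h_narrow:.3f}")
```

With unit-width bins the bar height equals the share of the data in that bin, which puts the bars and a smooth KDE on a comparable scale.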

