heatmap-like plot, but for categorical variables in seaborn

Submitted by 醉酒当歌 on 2020-01-24 02:21:12

Question


This is the same question as "heatmap-like plot, but for categorical variables", but using Python and seaborn instead of R:

Imagine I have the following dataframe:

df = pd.DataFrame({"John":"No Yes Maybe".split(),
                   "Elly":"Yes Yes Yes".split(),
                   "George":"No Maybe No".split()},
                   index="Mon Tue Wed".split())

Now I would like to plot a heatmap and color each cell by its value. For instance, "Yes", "No", and "Maybe" become green, gray, and yellow, respectively. The legend should show those three colors with their corresponding values.

I solved this problem myself as follows. I can't seem to pass a categorical color map to seaborn's heatmap, so instead I replace all text with numbers and afterwards reconstruct the color map that seaborn uses internally:

import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
import numpy as np
import matplotlib.patches as mpatches

# create dictionary with value to integer mappings
value_to_int = {value: i for i, value in enumerate(sorted(pd.unique(df.values.ravel())))}

f, ax = plt.subplots()
hm = sns.heatmap(df.replace(value_to_int).T, cmap="Pastel2", ax=ax, cbar=False)
# add legend
box = ax.get_position()
ax.set_position([box.x0, box.y0, box.width * 0.7, box.height])
legend_ax = f.add_axes([.7, .5, 1, .1])
legend_ax.axis('off')
# reconstruct color map
colors = plt.cm.Pastel2(np.linspace(0, 1, len(value_to_int)))
# add color map to legend
patches = [mpatches.Patch(facecolor=c, edgecolor=c) for c in colors]
legend = legend_ax.legend(patches,
    sorted(value_to_int.keys()),
    handlelength=0.8, loc='lower left')
for t in legend.get_texts():
    t.set_ha("left")

My question: is there a more succinct way of making this heatmap? If not, this might be a feature worth implementing in which case I'll post it on the seaborn issue tracker.


Answer 1:


You can use a discrete colormap and modify the colorbar, instead of using a legend.

# map each distinct value to an integer, as in the question
value_to_int = {j: i for i, j in enumerate(pd.unique(df.values.ravel()))}
n = len(value_to_int)
# discrete colormap (n samples from a given cmap)
cmap = sns.color_palette("Pastel2", n)
ax = sns.heatmap(df.replace(value_to_int), cmap=cmap)
# modify colorbar: place one tick at the center of each of the n color bands
colorbar = ax.collections[0].colorbar
r = colorbar.vmax - colorbar.vmin
colorbar.set_ticks([colorbar.vmin + r / n * (0.5 + i) for i in range(n)])
colorbar.set_ticklabels(list(value_to_int.keys()))
plt.show()
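To see why the tick formula lands in the middle of each color band, here is a small worked example. It assumes the three-category case from the question, so the integer codes run from 0 to 2; the variable names vmin, vmax, and band_width are only illustrative stand-ins for the colorbar attributes.

```python
# Assumed values for the three-category example: codes 0, 1, 2.
n = 3
vmin, vmax = 0, 2

# The colorbar range is split into n equal bands, one per category.
r = vmax - vmin
band_width = r / n

# Each tick sits half a band past the start of its band.
ticks = [vmin + band_width * (0.5 + i) for i in range(n)]
print(ticks)
```

With these numbers the bands are [0, 2/3], [2/3, 4/3], and [4/3, 2], so the ticks fall at 1/3, 1, and 5/3 rather than at the integer codes themselves.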

The colorbar part is adapted from this answer
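If specific colors are wanted (green, gray, and yellow, as in the question), one possible variation is to build the colormap from an explicit value-to-color dictionary and draw a patch legend instead of a colorbar. This is only a sketch, not part of the original answer: the value_to_color mapping and the codes name are assumptions for illustration, and map() is used for the encoding in place of replace().

```python
import matplotlib.pyplot as plt
import matplotlib.patches as mpatches
import pandas as pd
import seaborn as sns
from matplotlib.colors import ListedColormap

df = pd.DataFrame({"John": "No Yes Maybe".split(),
                   "Elly": "Yes Yes Yes".split(),
                   "George": "No Maybe No".split()},
                  index="Mon Tue Wed".split())

# Assumed mapping, taken from the colors named in the question.
value_to_color = {"No": "gray", "Maybe": "yellow", "Yes": "green"}
values = list(value_to_color)
value_to_int = {v: i for i, v in enumerate(values)}

# Encode every cell as its integer code.
codes = df.apply(lambda col: col.map(value_to_int)).astype(int)

# One colormap entry per category, in the same order as the codes.
cmap = ListedColormap([value_to_color[v] for v in values])

fig, ax = plt.subplots()
# vmin/vmax are offset by 0.5 so each integer code falls inside its own color band.
sns.heatmap(codes, cmap=cmap, vmin=-0.5, vmax=len(values) - 0.5,
            cbar=False, linewidths=0.5, ax=ax)

# Legend with one colored patch per category, placed right of the axes.
handles = [mpatches.Patch(color=value_to_color[v], label=v) for v in values]
ax.legend(handles=handles, bbox_to_anchor=(1.02, 1), loc="upper left",
          borderaxespad=0)
fig.tight_layout()
# Call plt.show() to display the figure interactively.
```

Fixing the order of the dictionary keeps the color assignment stable even when a category is absent from the data, which the enumerate-over-unique-values approach does not guarantee.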

HTH




Answer 2:


I would probably use bokeh for this purpose, as it has categorical heatmaps built in. Its y-axis labels are also written horizontally, which is more readable.

http://docs.bokeh.org/en/0.11.1/docs/gallery/heatmap_chart.html



Source: https://stackoverflow.com/questions/36227475/heatmap-like-plot-but-for-categorical-variables-in-seaborn