Question
In "How to draw a graphical count table in pandas" I asked how to draw a heatmap from input data such as:
customer1,customer2
a,b
a,c
a,c
b,a
b,c
b,c
c,c
a,a
b,c
b,c
The answer was:
x = df.pivot_table(index='customer1',columns='customer2',aggfunc='size',fill_value=0)
idx = x.max(axis=1).sort_values(ascending=False).index
sns.heatmap(x[idx].reindex(idx), annot=True)
This gives a square matrix showing the count for each pair of values from the two columns.
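The counting step can be checked without plotting. This is a minimal sketch, assuming pandas is installed; the DataFrame is rebuilt by hand from the sample rows above, and the name `table` is only illustrative:

```python
import pandas as pd

# Sample rows from the question, typed in by hand.
rows = [
    ("a", "b"), ("a", "c"), ("a", "c"), ("b", "a"), ("b", "c"),
    ("b", "c"), ("c", "c"), ("a", "a"), ("b", "c"), ("b", "c"),
]
df = pd.DataFrame(rows, columns=["customer1", "customer2"])

# Count how often each (customer1, customer2) pair occurs.
x = df.pivot_table(index="customer1", columns="customer2",
                   aggfunc="size", fill_value=0)

# Order the labels by the largest count in each row, highest first.
idx = x.max(axis=1).sort_values(ascending=False).index

# Apply the same order to columns and rows; this only works because
# every row label here also occurs as a column label.
table = x[idx].reindex(idx)
print(table)
```

Passing `table` to `sns.heatmap(table, annot=True)` then draws the heatmap from the answer.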
However, this solution doesn't work if there are items in the first column that don't appear in the second. For example:
a,b
a,c
c,b
This gives an error saying that [u'a'] is not in the index.
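The failure can be reproduced with a small sketch, assuming pandas is installed. The row labels of the count table are `a` and `c`, but its column labels are `b` and `c`, so selecting columns by the row index asks for a column `a` that does not exist. The variable `error` is only there to capture the exception for inspection:

```python
import pandas as pd

# The three rows from the failing example.
df = pd.DataFrame({"customer1": ["a", "a", "c"],
                   "customer2": ["b", "c", "b"]})

x = df.pivot_table(index="customer1", columns="customer2",
                   aggfunc="size", fill_value=0)
idx = x.max(axis=1).sort_values(ascending=False).index

print(list(x.index))    # row labels come from customer1
print(list(x.columns))  # column labels come from customer2

error = None
try:
    x[idx]  # column selection with row labels: 'a' is not a column
except KeyError as exc:
    error = exc
    print("KeyError:", exc)
```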
Is there a simple solution?
Answer 1:
Try this:
In [129]: df
Out[129]:
customer1 customer2
0 a b
1 a c
2 a c
3 b b
4 b c
5 b c
6 c c
7 a b
8 b c
9 b c
In [130]: x = df.pivot_table(index='customer1',columns='customer2',aggfunc='size',fill_value=0)
In [131]: idx = x.max(axis=1).sort_values(ascending=False).index
In [132]: cols = x.max().sort_values(ascending=False).index
In [133]: sns.heatmap(x[cols].reindex(idx), annot=True)
Out[133]: <matplotlib.axes._subplots.AxesSubplot at 0xbb22588>
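The session above works because rows and columns are now ordered separately, so a label only has to exist on its own axis. Here is a self-contained sketch of the same steps, assuming pandas is installed (and seaborn for the plot). The helper `plot_counts` and the `square` variant are my additions, not part of the answer. `square` reindexes both axes to the union of all labels, in case a square matrix is still wanted:

```python
import pandas as pd

# Data as shown in Out[129], typed in by hand.
rows = [
    ("a", "b"), ("a", "c"), ("a", "c"), ("b", "b"), ("b", "c"),
    ("b", "c"), ("c", "c"), ("a", "b"), ("b", "c"), ("b", "c"),
]
df = pd.DataFrame(rows, columns=["customer1", "customer2"])

x = df.pivot_table(index="customer1", columns="customer2",
                   aggfunc="size", fill_value=0)

# Sort rows and columns independently by their largest count.
idx = x.max(axis=1).sort_values(ascending=False).index
cols = x.max().sort_values(ascending=False).index
table = x[cols].reindex(idx)

# Optional variant (an assumption, not from the answer): force a square
# matrix by using every label on both axes and filling gaps with 0.
labels = x.index.union(x.columns)
square = x.reindex(index=labels, columns=labels, fill_value=0)


def plot_counts(counts):
    """Hypothetical helper: draw an annotated heatmap of a count table."""
    import seaborn as sns  # imported lazily so the counting runs without it
    return sns.heatmap(counts, annot=True)


print(table)
print(square)
```

Calling `plot_counts(table)` reproduces the plot from the session; `plot_counts(square)` shows the square version, with an all-zero column for `a`.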
Source: https://stackoverflow.com/questions/39291261/how-to-draw-a-heatmap-in-pandas-with-items-that-dont-occur-in-both-columns