Group variables by clusters on heatmap in R

為{幸葍}努か submitted on 2019-12-24 01:33:33

Question


I am trying to reproduce the first figure of this paper on graph clustering:

Here is a sample of my adjacency matrix:

data=cbind(c(48,0,0,0,0,1,3,0,1,0),c(0,75,0,0,3,2,1,0,0,1),c(0,0,34,1,16,0,3,0,1,1),c(0,0,1,58,0,1,3,1,0,0),c(0,3,16,0,181,6,6,0,2,2),c(1,2,0,1,6,56,2,1,0,1),c(3,1,3,3,6,2,129,0,0,1),c(0,0,0,1,0,1,0,13,0,1),c(1,0,1,0,2,0,0,0,70,0),c(0,1,1,0,2,1,1,1,0,85))
colnames(data)=letters[1:nrow(data)]
rownames(data)=colnames(data)

And with these commands I obtain the following heatmap:

library(reshape)   # melt()
library(scales)    # rescale(); not attached by reshape or current ggplot2
library(ggplot2)
data.m=melt(data)
data.m[,"rescale"]=round(rescale(data.m[,"value"]),3)
p=ggplot(data.m,aes(X1, X2))+geom_tile(aes(fill=rescale),colour="white") 
p=p+scale_fill_gradient(low="white",high="black")
p+theme(text=element_text(size=10),axis.text.x=element_text(angle=90,vjust=0)) 

This is very similar to the plot on the left of Figure 1 above. The only differences are that (1) the nodes are not ordered randomly but alphabetically, and (2) instead of just having binary black/white pixels, I am using a "shades of grey" palette to be able to show the strength of the co-occurrence between nodes.

But the point is that it is very hard to distinguish any cluster structure (and this would be even more true with the full set of 100 nodes). So, I want to order my vertices by clusters on the heatmap. I have this membership vector from a community detection algorithm:

membership=c(1,2,4,2,5,3,1,2,2,3)

Now, how can I obtain a heatmap similar to the plot on the right of Figure 1 above?

Thanks a lot in advance for any help

PS: I have experimented with the answers to "R draw kmeans clustering with heatmap" and "R: How do I display clustered matrix heatmap (similar color patterns are grouped)", but could not get what I want.


Answer 1:


It turned out this was extremely easy. I am still posting the solution so that others in the same situation don't waste time on it like I did.

The first part is exactly the same as before:

data.m=melt(data)
data.m[,"rescale"]=round(rescale(data.m[,"value"]),3)

Now, the trick is that the factor levels of the melted data.frame have to be ordered by membership, because ggplot2 lays out a discrete axis in the order of the factor levels:

data.m[,"X1"]=factor(data.m[,"X1"],levels=levels(data.m[,"X1"])[order(membership)])
data.m[,"X2"]=factor(data.m[,"X2"],levels=levels(data.m[,"X2"])[order(membership)])
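To see what order(membership) does here, the following base R sketch recomputes the node order outside of the melted data.frame. The names ord, node_order and clustered are only illustrative and not part of the original answer; data and membership are copied from the question.

```r
# Adjacency matrix and membership vector copied from the question.
data <- cbind(c(48,0,0,0,0,1,3,0,1,0), c(0,75,0,0,3,2,1,0,0,1),
              c(0,0,34,1,16,0,3,0,1,1), c(0,0,1,58,0,1,3,1,0,0),
              c(0,3,16,0,181,6,6,0,2,2), c(1,2,0,1,6,56,2,1,0,1),
              c(3,1,3,3,6,2,129,0,0,1), c(0,0,0,1,0,1,0,13,0,1),
              c(1,0,1,0,2,0,0,0,70,0), c(0,1,1,0,2,1,1,1,0,85))
colnames(data) <- letters[1:nrow(data)]
rownames(data) <- colnames(data)
membership <- c(1,2,4,2,5,3,1,2,2,3)

# order() returns the positions that sort membership, so nodes that
# share a cluster label become neighbours (ties keep their original order).
ord <- order(membership)

# Node names in cluster order (illustrative name, not from the answer).
node_order <- rownames(data)[ord]
print(node_order)

# The same permutation applied to rows and columns of the matrix itself.
clustered <- data[ord, ord]
print(membership[ord])
```

Passing levels=node_order to factor() should give the same level order as the two lines above, as long as the levels of X1 and X2 are in the same order as the rows of data (they are alphabetical here). Indexing by rownames(data) just avoids relying on that coincidence.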

Then, plot the heat map (same as before):

p=ggplot(data.m,aes(X1, X2))+geom_tile(aes(fill=rescale),colour="white") 
p=p+scale_fill_gradient(low="white",high="black")
p+theme(text=element_text(size=10),axis.text.x=element_text(angle=90,vjust=0))

This time, the cluster structure is clearly visible.
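As an optional extension that is not part of the original answer, the cluster boundaries can also be drawn explicitly. The sketch below is only one possible way to do it: it assumes ggplot2 is installed, builds the long-format data.frame by hand instead of with melt(), plots the raw counts rather than the rescaled values, and uses illustrative names (node_order, long, cluster_breaks) and an arbitrary red line colour.

```r
library(ggplot2)

# Adjacency matrix and membership vector copied from the question.
data <- cbind(c(48,0,0,0,0,1,3,0,1,0), c(0,75,0,0,3,2,1,0,0,1),
              c(0,0,34,1,16,0,3,0,1,1), c(0,0,1,58,0,1,3,1,0,0),
              c(0,3,16,0,181,6,6,0,2,2), c(1,2,0,1,6,56,2,1,0,1),
              c(3,1,3,3,6,2,129,0,0,1), c(0,0,0,1,0,1,0,13,0,1),
              c(1,0,1,0,2,0,0,0,70,0), c(0,1,1,0,2,1,1,1,0,85))
colnames(data) <- letters[1:nrow(data)]
rownames(data) <- colnames(data)
membership <- c(1,2,4,2,5,3,1,2,2,3)

# Node names sorted by cluster, used as factor levels for both axes.
node_order <- rownames(data)[order(membership)]

# Long format built by hand; as.vector() walks the matrix column by column,
# so the row name varies fastest and the column name slowest.
long <- data.frame(
  X1 = factor(rep(rownames(data), times = ncol(data)), levels = node_order),
  X2 = factor(rep(colnames(data), each = nrow(data)), levels = node_order),
  value = as.vector(data)
)

# Discrete axis positions are 1, 2, 3, ..., so a boundary between two
# clusters sits half a unit after the last node of each cluster.
cluster_sizes <- as.vector(table(membership))
cluster_breaks <- head(cumsum(cluster_sizes), -1) + 0.5

p <- ggplot(long, aes(X1, X2)) +
  geom_tile(aes(fill = value), colour = "white") +
  scale_fill_gradient(low = "white", high = "black") +
  geom_vline(xintercept = cluster_breaks, colour = "red") +
  geom_hline(yintercept = cluster_breaks, colour = "red")
print(p)
```

Because table() lists the cluster labels in increasing order, the cumulative sizes line up with the order(membership) ordering used for the factor levels.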



Source: https://stackoverflow.com/questions/29085940/group-variables-by-clusters-on-heatmap-in-r
