How to color a dendrogram's labels according to defined groups? (in R)

ぃ、小莉子 submitted on 2019-12-01 06:29:39

I suspect the function you are looking for is either color_labels or get_leaves_branches_col. The first colors your labels based on cutree (as color_branches does). The second returns the color of each leaf's branch, which you can then use to color the labels of the tree; this is useful if you colored the branches in an unusual way (as happens when using branches_attr_by_labels). For example:

# define dendrogram object to play with:
hc <- hclust(dist(USArrests[1:5,]), "ave")
dend <- as.dendrogram(hc)

library(dendextend)
par(mfrow = c(1,2), mar = c(5,2,1,0))
dend <- dend %>%
         color_branches(k = 3) %>%
         set("branches_lwd", c(2,1,2)) %>%
         set("branches_lty", c(1,2,1))

plot(dend)

dend <- color_labels(dend, k = 3)
# The same as:
# labels_colors(dend)  <- get_leaves_branches_col(dend)
plot(dend)

Either way, you should always have a look at the set function for ideas on what can be done to your dendrogram (this saves the hassle of remembering all the different function names).
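As a minimal sketch of how set could be used to color labels by groups you define yourself (rather than by cutree): the `groups` and `group_cols` vectors below are made up for illustration and are not part of the original question. The point to watch is that the colors must be supplied in the order the leaves appear in the plot, so the groups are looked up by leaf label.

```r
library(dendextend)

# Same five-state example as above
hc <- hclust(dist(USArrests[1:5, ]), "ave")
dend <- as.dendrogram(hc)

# Hypothetical grouping (an assumption for this sketch), named by row label
groups <- c(Alabama = "A", Alaska = "B", Arizona = "B",
            Arkansas = "A", California = "B")
# Hypothetical color for each group
group_cols <- c(A = "red", B = "blue")

# labels(dend) is in plot order, so looking groups up by label
# yields one color per leaf in the order the plot expects
leaf_cols <- unname(group_cols[groups[labels(dend)]])

dend <- dend %>%
  set("labels_col", leaf_cols) %>%
  set("labels_cex", 0.8)

plot(dend)
```

If your groups are stored in data order without names, indexing them with order.dendrogram(dend) should give the same plot-order alignment.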

You may take a look at this tutorial, which shows several solutions for visualizing dendrograms in R by groups:

https://rstudio-pubs-static.s3.amazonaws.com/1876_df0bf890dd54461f98719b461d987c3d.html

However, I think the best solution for your data is offered by the 'dendextend' package. See the tutorial (in particular the example using the 'iris' dataset, which is similar to your problem): https://nycdatascience.com/wp-content/uploads/2013/09/dendextend-tutorial.pdf

See also the vignette: http://cran.r-project.org/web/packages/dendextend/vignettes/Cluster_Analysis.html

user3875022

You may try this solution; just replace 'labs' with your 'MS.groups' and 'var' with your 'MS.groups' converted to numeric (perhaps with as.numeric). It comes from "How to colour the labels of a dendrogram by an additional factor variable in R".

## The data
df <- structure(list(labs = c("a1", "a2", "a3", "a4", "a5", "a6", "a7", 
"a8", "b1", "b2", "b3", "b4", "b5", "b6", "b7"), var = c(1L, 1L, 2L,     
1L,2L, 2L, 1L, 2L, 1L, 2L, 1L, 1L, 2L, 2L, 2L), td = c(13.1, 14.5, 16.7, 
12.9, 14.9, 15.6, 13.4, 15.3, 12.8, 14.5, 14.7, 13.1, 14.9, 15.6, 14.6), 
fd = c(2L, 3L, 3L, 1L, 2L, 3L, 2L, 3L, 2L, 4L, 2L, 1L, 4L, 3L, 3L)), 
.Names = c("labs", "var", "td", "fd"), class = "data.frame", row.names = 
c(NA, -15L))

## Subset for clustering
df.nw = df[,3:4]

# Assign the labs column to a vector
labs = df$labs

d = dist(as.matrix(df.nw))                          # find distance matrix 
hc = hclust(d, method="complete")                   # apply hierarchical clustering 

## plot the dendrogram

plot(hc, hang=-0.01, cex=0.6, labels=labs, xlab="") 

## convert hclust to dendrogram 
hcd = as.dendrogram(hc)                             

## plot using dendrogram object
plot(hcd, cex=0.6)                                  

Var = df$var                                        # factor variable for colours
varCol = gsub("1","red",Var)                        # convert numbers to colours
varCol = gsub("2","blue",varCol)

# colour-code dendrogram labels by a factor

# For each leaf, a$label is the row index: use it to look up the label text and its colour
colLab <- function(n) {
  if(is.leaf(n)) {
    a <- attributes(n)
    attr(n, "label") <- labs[a$label]
    attr(n, "nodePar") <- c(a$nodePar, lab.col = varCol[a$label]) 
  }
  n
}

## Coloured plot
plot(dendrapply(hcd, colLab))
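
The chained gsub() calls above are fine for group codes 1 and 2, but they would likely misbehave once a code contains another code as a substring (for example 1 and 10). A minimal sketch of a named lookup that avoids this; the group codes and palette here are made up for illustration:

```r
# Hypothetical group codes, one per observation, in data order
grp <- c(1L, 1L, 2L, 10L, 2L, 10L)

# Hypothetical palette, named by group code
palette_by_group <- c("1" = "red", "2" = "blue", "10" = "darkgreen")

# Chained gsub() would turn "10" into "red0", because its "1" is replaced first
gsub_result <- gsub("1", "red", grp)

# A named lookup matches whole codes only
grpCol <- unname(palette_by_group[as.character(grp)])
print(grpCol)
```

The resulting vector can be used in place of varCol in colLab, since it is also indexed by row position.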