How to Colour the Labels of a Dendrogram by an Additional Factor Variable in R

Try

# ... your code
colLab <- function(n) {
  if (is.leaf(n)) {
    a <- attributes(n)
    attr(n, "label") <- labs[a$label]
    attr(n, "nodePar") <- c(a$nodePar, lab.col = varCol[a$label])
  }
  n
}
plot(dendrapply(hcd, colLab))
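
The snippet above relies on objects (hcd, labs, varCol) that come from the asker's own code. As a minimal, self-contained sketch, assuming labs and varCol are vectors named by the original leaf labels, something like the following should work in base R. The region factor, the colours and the hcd_col name are made up for illustration.

```r
# Illustrative data: five states from USArrests and a made-up grouping factor.
dat <- USArrests[1:5, ]
region <- factor(c("South", "West", "West", "South", "West"))  # hypothetical groups

hcd <- as.dendrogram(hclust(dist(dat), "ave"))

# Lookup vectors keyed by the original leaf labels (the row names).
labs <- setNames(paste0(rownames(dat), " (", region, ")"), rownames(dat))
varCol <- setNames(c("firebrick", "steelblue")[as.integer(region)], rownames(dat))

colLab <- function(n) {
  if (is.leaf(n)) {
    a <- attributes(n)
    # Replace the label and set its colour by looking up the old label.
    attr(n, "label") <- labs[[a$label]]
    # pch = NA suppresses the point that nodePar would otherwise draw at the leaf.
    attr(n, "nodePar") <- c(a$nodePar, list(lab.col = varCol[[a$label]], pch = NA))
  }
  n
}

hcd_col <- dendrapply(hcd, colLab)
plot(hcd_col)
```

Because the lookup is by label rather than by position, the colours stay attached to the right leaves whatever order the clustering puts them in.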

How to color a dendrogram's labels according to defined groups? (in R)

I suspect the function you are looking for is either color_labels or get_leaves_branches_col. The first colors your labels based on cutree (as color_branches does). The second gets the color of the branch leading to each leaf, which you can then use to color the labels of the tree; this is useful if you colored the branches in an unusual way (as happens when using branches_attr_by_labels). For example:

# define dendrogram object to play with:
hc <- hclust(dist(USArrests[1:5,]), "ave")
dend <- as.dendrogram(hc)

library(dendextend)
par(mfrow = c(1,2), mar = c(5,2,1,0))
dend <- dend %>%
  color_branches(k = 3) %>%
  set("branches_lwd", c(2,1,2)) %>%
  set("branches_lty", c(1,2,1))

plot(dend)

dend <- color_labels(dend, k = 3)
# The same as:
# labels_colors(dend) <- get_leaves_branches_col(dend)
plot(dend)

Either way, it is worth looking at the set function for ideas on what can be done to your dendrogram (it saves the hassle of remembering all the different function names).

How do I add string variables to a dendrogram with labels coloured by factor level?

You need to update the labels just before plotting, for example with labels(dend) <- small_iris[,5][order.dendrogram(dend)]. Indexing by order.dendrogram(dend) puts the species in the same left-to-right order as the leaves.

Full code and output:

# install.packages("dendextend")
library(dendextend)

small_iris <- iris[c(1, 51, 101, 2, 52, 102), ]
dend <- as.dendrogram(hclust(dist(small_iris[,-5])))
# Like:
# dend <- small_iris[,-5] %>% dist %>% hclust %>% as.dendrogram

# By default, the labels of dend have no colors
labels_colors(dend)
par(mfrow = c(1,2))
plot(dend, main = "Original dend")

# let's add some color:
colors_to_use <- as.numeric(small_iris[,5])
colors_to_use
# But sort them based on their order in dend:
colors_to_use <- colors_to_use[order.dendrogram(dend)]
colors_to_use
# Now we can use them
labels_colors(dend) <- colors_to_use
# Now each label has a color
labels_colors(dend)

# Update: relabel the leaves with the species, in dendrogram order
labels(dend) <- small_iris[,5][order.dendrogram(dend)]

plot(dend, main = "A color for every Species")
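
A quick way to see why the reordering step matters is to line up the leaf labels with the factor by hand. The following base-R sketch (no dendextend needed; ord and leaf_table are illustrative names) assumes the same small_iris subset as above.

```r
small_iris <- iris[c(1, 51, 101, 2, 52, 102), ]
dend <- as.dendrogram(hclust(dist(small_iris[, -5])))

# order.dendrogram() gives, for each leaf from left to right,
# the row index of that observation in the original data.
ord <- order.dendrogram(dend)

# The leaf labels are therefore the row names taken in that order ...
stopifnot(identical(labels(dend), rownames(small_iris)[ord]))

# ... and any per-row variable must be indexed the same way to line up.
leaf_table <- data.frame(
  leaf = labels(dend),
  species = as.character(small_iris$Species[ord])
)
print(leaf_table)
```

Each row of leaf_table pairs a leaf with the species of that same observation, which is what the colour and label assignments above depend on.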

Dendextend: Regarding how to color a dendrogram’s labels according to defined groups

I'm glad you solved this on your own.
The simpler solution is to use the order_value = TRUE argument in the set function. For example:

library(dendextend)
iris2 <- iris[,-5]
rownames(iris2) <- paste(iris[,5],iris[,5],iris[,5], rownames(iris2))
dend <- iris2 %>% dist %>% hclust %>% as.dendrogram
dend <- dend %>%
  set("labels_colors", as.numeric(iris[,5]), order_value = TRUE) %>%
  set("labels_cex", .5)
par(mar = c(4,1,0,8))
plot(dend, horiz = TRUE)

In the resulting plot, the colors of the labels are based on the other variable, Species, in the iris dataset.

(P.S. I tripled the number of times the species name appears in each label to make it easier to see how the color relates to the length of the label.)

R: Color branches of dendrogram while preserving the color legend

Here is an example of how to achieve the desired coloring:

library(tidyverse)
library(ggdendro)
library(dendextend)

Some data:

matrix(rnorm(1000), ncol = 10) %>%
  scale %>%
  dist %>%
  hclust %>%
  as.dendrogram() -> dend_expr

tree_labels <- dendro_data(dend_expr, type = "rectangle")
# attach a (random) two-level factor to the 100 leaf labels
tree_labels$labels <- cbind(tree_labels$labels, Diagnosis = as.factor(sample(1:2, 100, replace = TRUE)))

Plot:

ggplot() +
  geom_segment(data = segment(tree_labels),
               aes(x = x, y = y, xend = xend, yend = yend)) +
  geom_segment(data = tree_labels$segments %>%
                 filter(yend == 0) %>%
                 left_join(tree_labels$labels, by = "x"),
               aes(x = x, y = y.x, xend = xend, yend = yend, color = Diagnosis)) +
  geom_text(data = label(tree_labels),
            aes(x = x, y = y, label = label, colour = Diagnosis, hjust = 0),
            size = 3) +
  coord_flip() +
  scale_y_reverse(expand = c(0.2, 0)) +
  scale_colour_brewer(palette = "Dark2") +
  theme_dendro() +
  ggtitle("Mayo Cohort: Hierarchical Clustering of Patients Colored by Diagnosis")

The key is in the second geom_segment call where I do:

tree_labels$segments %>%
  filter(yend == 0) %>%
  left_join(tree_labels$labels, by = "x")

This keeps only the segments that end at a leaf (yend == 0) and left-joins tree_labels$labels by x, so that each of those segments picks up the Diagnosis of its leaf.

Color branches of dendrogram using an existing column

If you want to color the branches of a dendrogram based on a certain variable then the following code (largely taken from the help for the dendrapply function) should give the desired result:

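x <- 1:100
dim(x) <- c(10, 10)
groups <- sample(c("red", "blue"), 10, replace = TRUE)

x.clust <- as.dendrogram(hclust(dist(x)))

local({
  colLab <<- function(n) {
    if (is.leaf(n)) {
      a <- attributes(n)
      i <<- i + 1
      # colour the edge leading to this leaf by its group
      attr(n, "edgePar") <-
        c(a$edgePar, list(col = mycols[i]))
    }
    n
  }
  # leaves are visited left to right, so put the colours in dendrogram order
  mycols <- groups[order.dendrogram(x.clust)]
  i <- 0
})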

x.clust.dend <- dendrapply(x.clust, colLab)
plot(x.clust.dend)

How to change dendrogram labels in r

In the hclust object you've created, cl, you have an element named "order" that contains the order in which the elements are in the dendrogram.

If you want to change the labels, you need to put the new labels in the same order (cl$order), so the "new" dendrogram is right:

df$column2[cl$order]
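
As a rough illustration of the ordering, here is a base-R sketch. The data frame contents and the column1 name are made up; only df, column2 and cl follow the names used above. Note that an hclust object keeps its own labels in data order, so the reordered vector is the one to use when labels are written onto the leaves directly, from left to right.

```r
# Hypothetical data: 'column1' and all values are made up for illustration.
df <- data.frame(
  column1 = c(1, 2, 10, 11, 30),
  column2 = c("alpha", "beta", "gamma", "delta", "epsilon"),
  stringsAsFactors = FALSE
)
cl <- hclust(dist(df$column1))

# New labels in the left-to-right order of the leaves.
new_labels <- df$column2[cl$order]

# The hclust object itself stores labels in data order, so assign them
# unsorted here; plotting applies cl$order internally.
cl$labels <- df$column2
plot(cl)

# Read left to right, the leaf labels match the reordered vector.
stopifnot(identical(labels(as.dendrogram(cl)), new_labels))
```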

