How to Create a Marimekko/Mosaic Plot in ggplot2

I had the same issue for a project some time back. My solution was to map the width aesthetic in geom_bar and combine it with the scales = "free_x" and space = "free_x" options in facet_grid to accommodate the different bar widths:

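# using diamonds dataset for illustration
library(dplyr)
library(ggplot2)

df <- diamonds %>%
  group_by(cut, clarity) %>%
  summarise(count = n()) %>%       # result is still grouped by cut
  mutate(cut.count = sum(count),   # total per cut, used as the bar width
         prop = count / sum(count)) %>%
  ungroup()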

ggplot(df,
       aes(x = cut, y = prop, width = cut.count, fill = clarity)) +
  geom_bar(stat = "identity", position = "fill", colour = "black") +
  # geom_text(aes(label = scales::percent(prop)), position = position_stack(vjust = 0.5)) + # if labels are desired
  facet_grid(~cut, scales = "free_x", space = "free_x") +
  scale_fill_brewer(palette = "RdYlGn") +
  theme_void()
  # + theme(panel.spacing.x = unit(0, "npc")) # append after theme_void() if no spacing is preferred between bars
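
To make the geometry behind the plot explicit, here is a minimal base R sketch of the rectangle coordinates a Marimekko chart encodes: bar widths follow each group's share of the total, and segment heights follow the shares within each group. The counts matrix and all names in it are hypothetical, not taken from the diamonds example.

```r
# Hypothetical counts: rows are x-axis groups, columns are fill categories.
counts <- matrix(c(30, 10,
                   20, 40),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(group = c("A", "B"), part = c("p", "q")))

group_share <- rowSums(counts) / sum(counts)  # bar widths
xmax <- cumsum(group_share)                   # right edge of each bar
xmin <- xmax - group_share                    # left edge of each bar

within <- prop.table(counts, margin = 1)      # segment heights within each bar
ymax <- t(apply(within, 1, cumsum))           # top edge of each segment
ymin <- ymax - within                         # bottom edge of each segment

# One row per rectangle; as.vector() walks the matrices column by column.
rects <- data.frame(
  group = rep(rownames(counts), times = ncol(counts)),
  part  = rep(colnames(counts), each = nrow(counts)),
  xmin  = unname(rep(xmin, times = ncol(counts))),
  xmax  = unname(rep(xmax, times = ncol(counts))),
  ymin  = as.vector(ymin),
  ymax  = as.vector(ymax)
)
rects
```

Each rectangle's area equals that cell's share of the grand total, which is the defining property of the chart. A data frame like this could also be drawn directly with geom_rect() as an alternative to the facet approach.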


Creating a mosaic plot with percentages

Using only one package (vcd), you can do the following. Note that I label the cells with the proportions within each class (i.e., each row sums to 1):

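library(vcd)
# df is the data from the question: the first column holds the class, the rest hold counts
M <- as.table(as.matrix(df[, -1]))
names(dimnames(M)) <- c("Class", "result")
labs <- round(prop.table(M, margin = 1), 2)  # row-wise proportions
mosaic(M, pop = FALSE)                       # pop = FALSE keeps the viewports so labels can be added
labeling_cells(text = labs, margin = 0)(M)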
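
As a quick check of what margin = 1 does, here is a small base R sketch on a made-up 2 x 2 table (hypothetical values, not the data from the question). It shows that prop.table() with margin = 1 normalises within each row.

```r
# Hypothetical 2 x 2 table of counts.
M_demo <- as.table(matrix(c(8, 2,
                            3, 7),
                          nrow = 2, byrow = TRUE,
                          dimnames = list(Class = c("a", "b"),
                                          result = c("yes", "no"))))

# margin = 1 divides every cell by its row total.
labs_demo <- round(prop.table(M_demo, margin = 1), 2)
labs_demo
rowSums(labs_demo)
```

With these values every row sums to exactly 1. On real data, rounding to two digits can leave a row sum slightly off.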


You can also just visualize it with a simple stacked bar plot:

library(RColorBrewer)
barplot(t(labs), col = brewer.pal(4, "Set2"))

legend("bottomright", legend = colnames(labs), inset = c(0, 1.1), xpd = TRUE,
       fill = brewer.pal(4, "Set2"), horiz = TRUE, cex = 0.7)


If you use ggplot2 or other packages built on it, you need to pivot your data to long format:

library(tidyr)
library(dplyr)
library(ggplot2)

df_long <- df %>%
  pivot_longer(-Class) %>%
  group_by(Class) %>%
  mutate(total = sum(value),
         p = round(100 * value / total, digits = 1)) %>%
  ungroup()

ggplot(df_long, aes(x = Class, y = p, fill = name)) +
  geom_col() +
  geom_text(aes(label = p), position = position_stack(vjust = 0.2))


If you want a mosaic-style plot in ggplot2, you can adapt the answer by z.lin. Note that I take the square root of the totals so that the smaller groups remain visible:

ggplot(df_long,
       aes(x = Class, y = p, width = sqrt(total), fill = name)) +
  geom_col(colour = "black") +
  geom_text(aes(label = p), position = position_stack(vjust = 0.5)) +
  facet_grid(~Class, scales = "free_x", space = "free_x") +
  theme_void()
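
To see what the square root does to the widths, here is a tiny sketch with made-up totals (the names big and small are hypothetical). It compares each group's share of the horizontal space with and without the transformation.

```r
totals <- c(big = 400, small = 25)  # hypothetical group totals

raw_share  <- totals / sum(totals)              # widths proportional to the totals
sqrt_share <- sqrt(totals) / sum(sqrt(totals))  # widths after taking the square root

round(raw_share, 3)
round(sqrt_share, 3)
```

The small group grows from roughly 6% of the width to 20%. The trade-off is that bar area is no longer proportional to the counts, so the result is easier to read but is not a strict mosaic plot.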


Creating a Function for a Mosaic Plot with ggmosaic Using Standard Evaluation

You could do:

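library(dplyr)
library(ggplot2)
library(ggmosaic)  # provides geom_mosaic() and the happy dataset

Mosaic <- function(var_product = "health", fill = "happy") {
  happy %>%
    na.omit() %>%
    # count() with all_of() replaces the deprecated count_()
    count(across(all_of(c(var_product, fill)))) %>%
    ggplot(aes(weight = n)) +
    geom_mosaic(aes_string(x = paste0("product(", var_product, ")"), fill = fill))
}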

Example:

Mosaic("sex","degree")


Stacked geom_bar(): keep equal gaps between bars with variable widths

You could use facet_grid and set the individual facets to have no space on the left and right sides:

graphics.off()
ggplot(dt2, aes(x = group, y = rel1, fill = risk, width = grpSize / 200)) +
  geom_bar(stat = "identity") +
  scale_x_discrete(expand = c(0, 0)) +
  facet_grid(~group, scales = "free", space = "free")


How to plot a mosaic plot from pre-calculated count data?

One possibility is to 'explode' your pre-calculated data using rep, repeating each combination as many times as its count:

country <- with(df, rep(x = Country, times = Count))
name <- with(df, rep(x = Name, times = Count))

df2 <- data.frame(country, name)
mosaicplot(country ~ name, data = df2)
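
An alternative that avoids expanding the rows is to build the contingency table directly from the counts with xtabs() and pass it to mosaicplot(). This is only a sketch: the column names follow the ones used above, but the values in df_counts are made up.

```r
# Hypothetical pre-calculated counts, one row per Country/Name combination.
df_counts <- data.frame(
  Country = c("FR", "FR", "DE", "DE"),
  Name    = c("x", "y", "x", "y"),
  Count   = c(5, 3, 2, 6)
)

# The left-hand side of the formula supplies the cell counts.
tab <- xtabs(Count ~ Country + Name, data = df_counts)
tab

mosaicplot(tab, main = "Mosaic from pre-calculated counts")
```

This should give the same picture as the rep approach, and it also works when the counts are not whole numbers, which rep cannot handle.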


Adding counts to ggmosaic, can this be done simpler?

This can be done with a single line of code using the inbuilt labelling functionality of the ggmosaic package.

To do so we simply add the geom_mosaic_text() layer:

library(tibble)
library(dplyr)
library(ggplot2)
library(ggmosaic)

data <- tribble(~a, ~b,
                1, 1,
                1, 1,
                1, 1,
                1, 2,
                2, 1,
                2, 2,
                3, 2) %>%
  mutate(across(c(a, b), as.factor))

ggplot(data) +
  geom_mosaic(aes(x = product(b, a), fill = b)) +
  geom_mosaic_text(aes(x = product(b, a), label = after_stat(.wt)), as.label = TRUE)
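
To sanity-check the numbers shown in the labels, the same cell counts can be tabulated in base R. This sketch re-enters the seven rows of the example by hand; cell_counts is a name introduced here for illustration.

```r
# The same seven observations as in the example, entered as two factors.
a <- factor(c(1, 1, 1, 1, 2, 2, 3))
b <- factor(c(1, 1, 1, 2, 1, 2, 2))

cell_counts <- table(a = a, b = b)
cell_counts
```

The combination a = 1, b = 1 occurs three times and a = 3, b = 1 never occurs, so the labels on the plot can be compared against this table.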


ggmosaic: how to remove the thin line when the count of a factor level is 0

I don't know whether this can be adjusted in ggmosaic, but it turns out that this plot can be made very easily with ggplot2:

library(ggplot2)
library(ggmosaic)  # for the happy dataset

happy2 <- happy
# assign NA by index; ifelse() would return the factor's integer codes
idx <- which(happy2$marital == "never married" & happy2$happy == "not too happy")
happy2$marital[idx] <- NA

ggplot(happy2) +
  geom_bar(aes(x = marital, fill = happy), colour = "black",
           width = 1, position = "fill") +
  scale_y_continuous(expand = c(0, 0)) +
  scale_x_discrete(expand = c(0, 0))
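
One thing to watch when recoding a factor column such as marital to NA: ifelse() does not return a factor but the underlying integer codes, so the level labels are lost. A minimal sketch with a made-up three-level factor (not the happy data) shows the difference from assigning NA by index.

```r
marital <- factor(c("married", "never married", "divorced"))

# ifelse() drops the factor attributes and returns the integer codes.
recoded_codes <- ifelse(marital == "never married", NA, marital)
recoded_codes

# Assigning NA by index keeps the factor and its levels.
recoded_factor <- marital
recoded_factor[which(marital == "never married")] <- NA
recoded_factor
```

With the factor preserved, ggplot2 treats the column as discrete and keeps the original labels on the axis.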

