How to produce stacked bars within grouped barchart in R
Here is what I came up with, similar to a solution proposed here: stacked bars within grouped bar chart
Melt the data.frame and add a new column cat:
library(reshape2) # for melt
melted <- melt(test, "person")
melted$cat <- ''
melted[melted$variable == 'value1',]$cat <- "first"
melted[melted$variable != 'value1',]$cat <- "second"
Plot a stacked chart of cat vs value, faceting by person. You may need to adjust the labels to get what you want:
ggplot(melted, aes(x = cat, y = value, fill = variable)) +
  geom_bar(stat = 'identity', position = 'stack') + facet_grid(~ person)
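For readers who think in Python, the reshaping step above can be sketched without any plotting library. This is only an illustration: the rows in `test` are made up (the original data is not shown), and `melt_rows` is a hypothetical helper, not a function from reshape2 or pandas. Row order differs from R's melt, which does not matter for the bar heights.

```python
# Hypothetical wide-format rows standing in for the `test` data.frame,
# whose contents are not shown in the answer.
test = [
    {"person": "A", "value1": 3, "value2": 1, "value3": 2},
    {"person": "B", "value1": 4, "value2": 2, "value3": 5},
]


def melt_rows(rows, id_var):
    """Turn wide rows into long (id, variable, value) rows, similar to melt()."""
    long_rows = []
    for row in rows:
        for name, value in row.items():
            if name != id_var:
                long_rows.append(
                    {id_var: row[id_var], "variable": name, "value": value}
                )
    return long_rows


melted = melt_rows(test, "person")

# Same rule as the R code: value1 forms the "first" bar, everything
# else is stacked together in the "second" bar.
for row in melted:
    row["cat"] = "first" if row["variable"] == "value1" else "second"

# Total height of each bar: one bar per (person, cat) pair.
bar_heights = {}
for row in melted:
    key = (row["person"], row["cat"])
    bar_heights[key] = bar_heights.get(key, 0) + row["value"]

print(bar_heights)
```

Each person ends up with two bars, and only the "second" bar has more than one stacked segment.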
stacked bars within grouped bar chart
You can put cat on the x-axis and use facet_grid with year:
ggplot() +
  geom_bar(data=test, aes(y = value, x = cat, fill = cond), stat="identity",
           position='stack') +
  theme_bw() +
  facet_grid( ~ year)
How to create grouped and stacked bars
- As an FYI, stacked bars are not the best option, because they can make it difficult to compare bar values and can easily be misinterpreted. The purpose of a visualization is to present data in an easily understood format; make sure the message is clear. Side-by-side bars are often a better option.
- Side-by-side stacked bars are a difficult manual process to construct; it's better to use a figure-level method like seaborn.catplot, which will create a single, easy-to-read data visualization.
- Bar plot ticks are located by a 0-indexed range (not datetimes); the dates are just labels, so it is not necessary to convert them to a datetime dtype.
- Tested in python 3.8.11, pandas 1.3.2, matplotlib 3.4.3, seaborn 0.11.2
seaborn
import seaborn as sns
sns.catplot(kind='bar', data=df, col='subsidiary', x='date', y='value', hue='business')
Create grouped and stacked bars
- See Stacked Bar Chart and Grouped bar chart with labels
- The issue with the creation of the stacked bars in the OP is that bottom is being set on the entire dataframe for that group, instead of only the values that make up the bar height.
- "Do I really need to create three sub-dfs per subsidiary?" Yes, a DataFrame is needed for every group, so 6 in this case.
  - Creating the data subsets can be automated using a dict-comprehension to unpack the .groupby object into a dict.
    - data = {''.join(k): v for k, v in df.groupby(['subsidiary', 'business'])} to create a dict of DataFrames
    - Access the values like: data['EUCORP'].value
- Automating the plot creation is more arduous. As can be seen, x depends on how many groups of bars there are for each tick, and bottom depends on the values for each subsequent plot.
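To see what the dict comprehension in the list above produces, here is a rough plain-Python equivalent that groups records instead of DataFrame rows. The records are hypothetical; the names 'EU', 'US', 'CORP' and 'RETAIL' are assumptions chosen only so that one joined key matches the 'EUCORP' example.

```python
from collections import defaultdict

# Hypothetical long-format records; the real df is not shown in the answer.
records = [
    {"date": "2021-01", "subsidiary": "EU", "business": "CORP", "value": 10},
    {"date": "2021-01", "subsidiary": "EU", "business": "RETAIL", "value": 4},
    {"date": "2021-01", "subsidiary": "US", "business": "CORP", "value": 7},
    {"date": "2021-02", "subsidiary": "EU", "business": "CORP", "value": 12},
    {"date": "2021-02", "subsidiary": "EU", "business": "RETAIL", "value": 5},
    {"date": "2021-02", "subsidiary": "US", "business": "CORP", "value": 9},
]

# One list of values per (subsidiary, business) pair, keyed by the two
# names joined together, mirroring ''.join(k) in the dict comprehension.
data = defaultdict(list)
for rec in records:
    key = "".join((rec["subsidiary"], rec["business"]))
    data[key].append(rec["value"])

print(sorted(data))    # the group keys
print(data["EUCORP"])  # values for one group, in date order
```

Each group holds one value per date, which is exactly the set of heights needed for one layer of one stack.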
import numpy as np
import matplotlib.pyplot as plt
labels=df['date'].drop_duplicates() # set the dates as labels
x0 = np.arange(len(labels)) # create an array of values for the ticks that can perform arithmetic with width (w)
# create the data groups with a dict comprehension and groupby
data = {''.join(k): v for k, v in df.groupby(['subsidiary', 'business'])}
# build the plots
subs = df.subsidiary.unique()
stacks = len(subs) # how many stacks in each group for a tick location
business = df.business.unique()
# set the width
w = 0.35
# this needs to be adjusted based on the number of stacks; each location needs to be split into the proper number of locations
x1 = [x0 - w/stacks, x0 + w/stacks]
fig, ax = plt.subplots()
for x, sub in zip(x1, subs):
    bottom = 0
    for bus in business:
        height = data[f'{sub}{bus}'].value.to_numpy()
        ax.bar(x=x, height=height, width=w, bottom=bottom)
        bottom += height
ax.set_xticks(x0)
_ = ax.set_xticklabels(labels)
- As you can see, small values are difficult to discern, and using ax.set_yscale('log') does not work as expected with stacked bars (e.g. it does not make small values more readable).
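The x and bottom arithmetic used in the plotting loop can be checked on its own, without matplotlib. The sketch below assumes three tick locations and three segment heights, all made-up numbers; only w and the two-stack layout come from the code above.

```python
from itertools import accumulate

w = 0.35                    # bar width, as in the code above
stacks = 2                  # two subsidiaries per tick
n_ticks = 3                 # hypothetical number of dates
x0 = list(range(n_ticks))   # tick locations 0, 1, 2

# Centres of the left and right stack at each tick,
# as in x1 = [x0 - w/stacks, x0 + w/stacks].
x_left = [x - w / stacks for x in x0]
x_right = [x + w / stacks for x in x0]

# Hypothetical heights of three business segments at one tick.
heights = [5, 2, 8]

# Each segment starts where the previous ones end, so bottom is the
# running total of the heights drawn so far.
bottoms = [0] + list(accumulate(heights))[:-1]

print(x_left, x_right)
print(bottoms)
```

With two stacks the centres sit exactly one bar width apart, so the bars touch without overlapping. For a different number of stacks the offsets have to be recomputed, as the comment in the original code warns.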
Create only stacked bars
- As mentioned by @r-beginners, use .pivot, or .pivot_table, to reshape the dataframe to a wide form to create stacked bars where the x-axis is a tuple ('date', 'subsidiary').
  - Use .pivot if there are no repeat values for each category
  - Use .pivot_table if there are repeat values that must be combined with aggfunc (e.g. 'sum', 'mean', etc.)
# reshape the dataframe
dfp = df.pivot(index=['date', 'subsidiary'], columns=['business'], values='value')
# plot stacked bars
dfp.plot(kind='bar', stacked=True, rot=0, figsize=(10, 4))
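The wide form that the reshape aims for, and the reason repeated values need an aggregation, can be sketched in plain Python. The rows are hypothetical, and summing stands in for aggfunc='sum'; this is an illustration of the idea, not of how pandas implements it.

```python
# Hypothetical long-format rows: (date, subsidiary, business, value).
# The last two rows repeat the same combination on purpose.
rows = [
    ("2021-01", "EU", "CORP", 10),
    ("2021-01", "EU", "RETAIL", 4),
    ("2021-01", "US", "CORP", 7),
    ("2021-01", "US", "CORP", 3),
]

wide = {}         # (date, subsidiary) -> {business: value}
duplicates = []   # combinations seen more than once

for date, subsidiary, business, value in rows:
    cell = wide.setdefault((date, subsidiary), {})
    if business in cell:
        duplicates.append((date, subsidiary, business))
        cell[business] += value   # combine repeats by summing
    else:
        cell[business] = value

print(wide)
print(duplicates)
```

Each key of `wide` is one bar on the x-axis and each inner entry is one stacked segment. If `duplicates` is non-empty, a plain pivot has no single value to place in that cell, which is the case where .pivot_table with an aggfunc is the appropriate tool.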
grouped (twice) and stacked bar chart with facet wrapping
Updated following OP's comment about wanting dodged and stacked bars: dodged by majortype; stacked by type.
Combining dodged and stacked bars is not a feature of ggplot: https://github.com/tidyverse/ggplot2/issues/2267
However, with help from this link: ggplot2 - bar plot with both stack and dodge, and a bit of additional tinkering, you could try this:
library(ggplot2)
library(dplyr)
# prepare data so that values are in effect stacked and in the right order
dat <-
mydata %>%
group_by(year, subject, student, majortype) %>%
arrange(type) %>%
mutate(val_cum = cumsum(value))
ggplot(dat, aes(fill = majortype, y = val_cum, x = year)) +
geom_col(data = filter(dat, type == "low income"), position = position_dodge2(width = 0.9), alpha = 0.5)+
geom_col(data = filter(dat, type == "high income"), position = position_dodge2(width = 0.9), alpha = 1) +
geom_tile(aes(y = NA_integer_, alpha = type)) +
scale_fill_manual(breaks = c("passed", "total"),
labels = c("High income - passed", "High income - total"),
values = c("red", "blue"))+
guides(alpha = guide_legend(override.aes = list(fill = c("red", "blue"), alpha = c(0.5, 0.5))))+
scale_alpha_manual(breaks = c("high income", "low income"),
labels = c("Low income - passed", "Low income - total"),
values = c(1, 0.5))+
facet_wrap(student~subject)+
labs(x = NULL,
y = "Number of students",
fill = NULL,
alpha = NULL)+
theme_minimal() +
theme(text = element_text(size=15),
plot.title = element_text(size=20, face="bold"),
axis.text = element_text(size=9))
Created on 2022-05-09 by the reprex package (v2.0.1)
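The trick in the data preparation step is that the bars are not really stacked: cumulative values are computed, the taller cumulative bar is drawn first, and the shorter one is drawn over it. A small Python sketch of that calculation for a single group, using made-up values (the contents of mydata are not shown):

```python
from itertools import accumulate

# Hypothetical values for one year/subject/student/majortype group.
group = [
    {"type": "low income", "value": 30},
    {"type": "high income", "value": 20},
]

# arrange(type): sort by type, so "high income" comes before "low income".
group.sort(key=lambda r: r["type"])

# mutate(val_cum = cumsum(value)): running total in that order.
for row, total in zip(group, accumulate(r["value"] for r in group)):
    row["val_cum"] = total

print(group)
```

Here the "low income" row gets val_cum = 50 and is drawn first; the "high income" bar of height 20 is then drawn over it from zero, leaving 30 units of the first bar visible above it. The result looks like a stack even though both layers are dodged columns.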
Grouped stacked bar chart in R
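You need a stacking position instead of "dodge": position = "stack" for raw values, or position = "fill" (used in the code below) to show proportions.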
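The code in this answer uses position = "fill", which stacks the segments and rescales each bar to a total of 1. As a rough check of what that means for one bar, the Python sketch below recomputes the Staff bar of the Male facet from the columns listed in the Data section of this answer. `emp_class` is a hypothetical helper that imitates the cut() call for positive counts.

```python
# Columns copied from the Data section of this answer.
count = [38, 38, 33, 34, 41, 45, 35, 39, 39, 44, 42]
male = [10, 20, 15, 17, 35, 25, 17, 30, 9, 22, 20]
emp_group = ["Staff", "Teacher", "Teacher", "Teacher", "Staff", "Teacher",
             "Staff", "Staff", "Teacher", "Staff", "Teacher"]


def emp_class(c):
    """Imitate cut() for positive counts: bins (0,30], (30,40], (40,50], ..."""
    bins = [(30, "(0-30)"), (40, "(31-40)"), (50, "(41-50)"),
            (60, "(51-60)"), (100, "(61-100)")]
    for upper, label in bins:
        if c <= upper:
            return label
    return None


# Sum the Male values per age class for the Staff bar.
totals = {}
for c, m, g in zip(count, male, emp_group):
    if g == "Staff":
        label = emp_class(c)
        totals[label] = totals.get(label, 0) + m

# position = "fill": rescale the segments of the bar so they sum to 1.
bar_total = sum(totals.values())
shares = {label: value / bar_total for label, value in totals.items()}

print(shares)  # {'(31-40)': 0.5, '(41-50)': 0.5}
```

With position = "stack" the same bar would instead reach the raw total of 114.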
I reorganised your code slightly:
library(ggplot2)
library(dplyr)
library(tidyr) # for pivot_longer
data %>%
mutate(emp_class = cut(Count,
breaks = c(0, 30, 40, 50, 60, 100),
labels = c('(0-30)', '(31-40)', '(41-50)', '(51-60)', '(61-100)')
)
) %>%
pivot_longer(c(Male, Female),
names_to = "MF") %>%
group_by(Emp_group, MF) %>%
add_count() %>%
mutate(x_axis = as.factor(paste(Emp_group, n, sep = "\n"))) %>%
ggplot(aes(x = x_axis, fill = as.factor(emp_class))) +
geom_bar(aes(y = value),
position = "fill",
stat = "identity") +
labs(x = "", y = "% Employees", fill = "Age group") +
theme(text = element_text(size = 20)) +
facet_wrap(~MF) +
scale_y_continuous(labels = scales::percent_format())
This returns
Data
structure(list(Count = c(38, 38, 33, 34, 41, 45, 35, 39, 39,
44, 42), Male = c(10, 20, 15, 17, 35, 25, 17, 30, 9, 22, 20),
Female = c(28, 18, 18, 17, 6, 20, 18, 9, 30, 22, 22), Emp_group = c("Staff",
"Teacher", "Teacher", "Teacher", "Staff", "Teacher", "Staff",
"Staff", "Teacher", "Staff", "Teacher")), class = c("spec_tbl_df",
"tbl_df", "tbl", "data.frame"), row.names = c(NA, -11L), spec = structure(list(
cols = list(Count = structure(list(), class = c("collector_double",
"collector")), Male = structure(list(), class = c("collector_double",
"collector")), Female = structure(list(), class = c("collector_double",
"collector")), Emp_group = structure(list(), class = c("collector_character",
"collector"))), default = structure(list(), class = c("collector_guess",
"collector")), skip = 1L), class = "col_spec"))
Combine grouped and stacked bars
I am not aware of a straightforward solution for this (read more here: Combination of grouped and stacked bar chart). But we can find a workaround by editing the data and modifying the axis. See below:
DT1 <- DT[variable =="example_var3", year := year + 0.4][]
my_plot <-
plot_ly(
DT1,
x = ~ year,
y = ~ value,
type = "bar",
name = ~ variable,
text = ~ ratio,
textposition = 'auto'
) %>%
layout (barmode = "stack",
xaxis = list(
ticktext = list(2019, 2020, 2021),
tickvals = lapply(list(2019, 2020, 2021), `+`, 0.2),
tickmode = "array"
))
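The arithmetic behind the workaround above is small enough to check by hand: one variable is moved 0.4 to the right of each year, and the tick for that year is placed halfway between the two bar positions. A Python sketch of it follows; "example_var1" and "example_var2" are hypothetical names for the unshifted variables, since only example_var3 appears in the answer.

```python
years = [2019, 2020, 2021]
shift = 0.4   # offset applied to the rows of example_var3

# Hypothetical names for the unshifted variables; only example_var3
# is moved, as in the data.table line above.
variables = ["example_var1", "example_var2", "example_var3"]

# x position of every (year, variable) bar segment.
positions = {
    (year, var): year + shift if var == "example_var3" else year
    for year in years
    for var in variables
}

# One tick per year, centred between the two bar positions.
tickvals = [year + shift / 2 for year in years]

print(positions[(2019, "example_var1")], positions[(2019, "example_var3")])
print(tickvals)
```

Because the unshifted variables share an x value, barmode = "stack" stacks them, while the shifted variable forms its own bar next to that stack.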
How to plot a Stacked and grouped bar chart in ggplot?
Suppose you want id on the x-axis, bars side by side for each month, and the different types stacked. You can split the data frame by month, add a bar layer for each month, and shift x by a fixed amount for the second month's bars so that the two sets are separated:
library(ggplot2)
library(dplyr)
barwidth = 0.35
month_one <- filter(df, month == 1) %>%
group_by(id) %>% arrange(-type) %>%
mutate(pos = cumsum(count) - count / 2) # calculate the position of the label
month_two <- filter(df, month == 2) %>%
group_by(id) %>% arrange(-type) %>%
mutate(pos = cumsum(count) - count / 2)
ggplot() +
geom_bar(data = month_one,
mapping = aes(x = id, y = count, fill = as.factor(type)),
stat="identity",
position='stack',
width = barwidth) +
geom_text(data = month_one,
aes(x = id, y = pos, label = count )) +
geom_bar(data = filter(df, month==2),
mapping = aes(x = id + barwidth + 0.01, y = count, fill = as.factor(type)),
stat="identity",
position='stack' ,
width = barwidth) +
geom_text(data = month_two,
aes(x = id + barwidth + 0.01, y = pos, label = count )) +
labs(fill = "type")
gives:
dput(df)
structure(list(id = c(1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L,
2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L), month = c(1L, 1L, 1L, 2L, 2L,
2L, 1L, 1L, 1L, 2L, 2L, 2L, 1L, 1L, 1L, 2L, 2L, 2L), type = c(1L,
2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L,
3L), count = c(10L, 9L, 26L, 60L, 90L, 80L, 10L, 9L, 26L, 60L,
90L, 80L, 10L, 9L, 26L, 60L, 90L, 80L)), .Names = c("id", "month",
"type", "count"), class = "data.frame", row.names = c(NA, -18L
))
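The label position used above, cumsum(count) - count / 2, places each label at the middle of its segment. A quick Python check using the month 1 rows for id 1 from the dput(df) output:

```python
from itertools import accumulate

# Month 1 rows for id 1, copied from the dput(df) output: (type, count).
rows = [(1, 10), (2, 9), (3, 26)]

# arrange(-type): highest type first.
rows.sort(key=lambda r: -r[0])
counts = [count for _, count in rows]

# pos = cumsum(count) - count / 2: top of each segment minus half its height.
positions = [top - count / 2 for top, count in zip(accumulate(counts), counts)]

print(positions)  # [13.0, 30.5, 40.0]
```

The segment for type 3 spans 0 to 26, so its label sits at 13; type 2 spans 26 to 35 and type 1 spans 35 to 45.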