Order Categorical Data in a Stacked Bar Plot with Ggplot2

Order categorical data in a stacked bar plot with ggplot2

I see that you have an order column in your data frame, which I gather holds your intended ordering. Hence you can simply do:

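library(ggplot2)
# qplot() is deprecated in current ggplot2; this is the ggplot() equivalent
p0 = ggplot(df1, aes(factor(kclust), fill = reorder(hhDomMil, order))) +
  geom_bar(position = 'fill')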

Here are the elements of this code that take care of your questions:

  1. How do I plot such an ordered plot? reorder
  2. How do I set up x so that each bar is "on" one number? factor(kclust)
  3. How do I separate the bars?
  4. How do I print all kclust numbers in x? factor(kclust)
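To see what reorder contributes here, the sketch below runs it on a small made-up data frame. The values are hypothetical; only the column names hhDomMil and order come from the question. reorder sorts the factor levels by the mean of order within each hhDomMil value, and those levels are what the fill stacking and the legend follow.

```r
# Hypothetical stand-in for df1; only the column names match the question
df1 <- data.frame(
  hhDomMil = c("b", "a", "c", "b", "a", "c"),
  order    = c(1, 3, 2, 1, 3, 2)
)

# Default (alphabetical) level order
default_levels <- levels(factor(df1$hhDomMil))

# Levels sorted by the mean of `order` within each hhDomMil value
reordered  <- reorder(df1$hhDomMil, df1$order)
new_levels <- levels(reordered)

print(default_levels)
print(new_levels)
```

Wrapping kclust in factor() works the same way on the x axis: each distinct value becomes its own discrete position.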

I remember from a previous question of yours that the hhDomMil corresponded to different groups, and I suspect your ordering follows the grouping. In that case, you might want to use that information to choose a color palette that makes it simpler to follow the graph. Here is one way to do it.

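library(RColorBrewer)
# brewer.pal() returns at least 3 colours, so subset when only 2 are needed
mycols = c(brewer.pal(3, 'Oranges'), brewer.pal(3, 'Greens'),
  brewer.pal(3, 'Blues')[1:2], brewer.pal(3, 'PuRd')[1:2])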

p0 + scale_fill_manual(values = mycols)


ggplot2 order categorical stacked bars by proportions of y-axis

Use the group aesthetic to control the order of the stacked bar.

s <- "Sequence Abundance Category
CAGTG 0.8 A
CAGTG 0.2 B
CAGTG 0.6 C
CAGTG 0.3 D
CAGTG 0.1 E
GGGAC 0.1 A
GGGAC 0.1 B
GGGAC 0.3 C
GGGAC 0.6 D
GGGAC 0.1 E
CTTGA 0.1 A
CTTGA 0.7 B
CTTGA 0.1 C
CTTGA 0.1 D
CTTGA 0.8 E"
d <- read.delim(textConnection(s), header = TRUE, sep = " ")

library(ggplot2)
# Add the "group" aesthetic to control the order of the stacked bars
g = ggplot(d, aes(x = Category, y = Abundance, fill = Sequence, group = Abundance)) +
  geom_bar(position = "fill", stat = "identity")
g


Order the stacked barplot by proportion or percent in R

Add the proportion of c within each group like this, when generating p:

... %>%
  group_by(Name) %>%
  mutate(prop_c = sum(value[variable == "c"], na.rm = TRUE) / sum(value, na.rm = TRUE))

Then plot, using reorder:

# the decreasing argument of reorder() needs R >= 4.3.0
ggplot() +
  geom_col(data = p, aes(x = reorder(Name, prop_c, decreasing = TRUE), y = value, fill = variable), position = "fill") +
  geom_text(data = long_totals, aes(y = 100, x = Name, label = Total), size = 7, position = position_fill(vjust = 1.02)) +
  scale_y_continuous(labels = scales::percent_format())
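For a quick check of what prop_c holds, here is a base R sketch on made-up long-format data. The values are hypothetical; the column names Name, variable and value follow the snippet above. It also negates the score instead of using decreasing, which, as far as I know, is only available in reorder from R 4.3.0 onwards.

```r
# Hypothetical long-format data; column names follow the snippet above
p <- data.frame(
  Name     = rep(c("x", "y", "z"), each = 3),
  variable = rep(c("a", "b", "c"), times = 3),
  value    = c(5, 3, 2,  1, 1, 8,  4, 1, 5)
)

# Share of variable "c" in each Name's total
c_total      <- tapply(ifelse(p$variable == "c", p$value, 0), p$Name, sum)
all_total    <- tapply(p$value, p$Name, sum)
prop_by_name <- c_total / all_total

# Attach it to every row, like mutate() after group_by()
p$prop_c <- as.numeric(prop_by_name[p$Name])

# Negating the score sorts in decreasing order on any R version
name_levels <- levels(reorder(p$Name, -p$prop_c))
print(name_levels)
```

The Name with the largest share of "c" comes first, which is the order the bars would take on the x axis.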


sort columns with categorical variables by numerical varables in stacked barplot

The issue here is that all your percentages for a given category (name) in fact add up to 100%. So sorting by percentage, which is normally achieved via aes(x = reorder(name, percentage), y = percentage), won’t work here.

Instead, you probably want to order by the percentage of the data that has class = 1 (or class = -1). Doing this requires some trickery: Use ifelse to select the percentage for the rows where class == 1. For all other rows, select the value 0:

ggplot(df, aes(x = reorder(name, ifelse(class == 1, percentage, 0)), y = percentage, fill = factor(class))) +
geom_bar(stat = "identity") +
scale_fill_discrete(name = "Class") +
xlab('Names')

You might want to execute just the reorder instruction to see what’s going on:

reorder(df$name, ifelse(df$class == 1, df$percentage, 0))
# [1] A A B B C C D D
# attr(,"scores")
# A B C D
# 44.055 48.720 47.020 46.630
# Levels: A D C B

As you can see, your names got reordered based on the mean percentage for each category (by default, reorder uses the mean; see its manual page for more details). But the “mean” we calculated was between each name’s percentage for class = 1, and the value 0 (for class ≠ 1).
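The halving effect is easy to reproduce on a small made-up data frame. The numbers below are hypothetical, with two rows per name as in the question. The sketch also shows that passing FUN = sum keeps the class = 1 percentage itself as the score; with the same number of rows per name the resulting order is identical.

```r
# Hypothetical data: two classes per name, percentages sum to 100 per name
df <- data.frame(
  name       = rep(c("A", "B", "C"), each = 2),
  class      = rep(c(-1, 1), times = 3),
  percentage = c(40, 60,  70, 30,  55, 45)
)

# Score per row: the class-1 percentage, or 0 for the other class
score <- ifelse(df$class == 1, df$percentage, 0)

# Default FUN = mean halves each class-1 percentage (one of two rows is 0)
by_mean     <- reorder(df$name, score)
mean_scores <- attr(by_mean, "scores")

# FUN = sum keeps the class-1 percentage itself; the ordering is the same
by_sum     <- reorder(df$name, score, FUN = sum)
sum_scores <- attr(by_sum, "scores")

print(mean_scores)
print(sum_scores)
print(levels(by_mean))
```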

Order Stacked Bar Graph by sum / total of all subgroups

The general (non ggplot-specific) answer is to use reorder() to reset the factor levels in a categorical column, based on some function of the other columns.

## Examine the default factor order
levels(samp.data$fullname)

## Reorder fullname based on the sum of the other columns
samp.data$fullname <- reorder(samp.data$fullname, rowSums(samp.data[-1]))

## Examine the new factor order
levels(samp.data$fullname)
attributes(samp.data$fullname)

Then just replot, using code from the original question:

library(reshape2)
library(ggplot2)
md <- melt(samp.data, id = "fullname")
## geom_col() plots the y values as given; element_text() rotates the axis labels
temp.plot <- ggplot(data = md, aes(x = fullname, y = value, fill = variable)) +
  geom_col() +
  theme(axis.text.x = element_text(angle = 90)) +
  labs(title = "Score Distribution")
## ggsave(temp.plot, filename = "test.png")
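The reordering step can be tried on its own with a tiny made-up table. The names and scores below are hypothetical; only the fullname column name comes from the question. Because there is one row per name, the default mean used by reorder is simply that row's total.

```r
# Hypothetical wide data: one row per name, one column per score
samp.data <- data.frame(
  fullname = factor(c("Ann", "Bob", "Cy")),
  math     = c(90, 50, 70),
  art      = c(5, 60, 10)
)

# Default (alphabetical) level order
before <- levels(samp.data$fullname)

# Total of all score columns per row (everything except fullname)
row_totals <- rowSums(samp.data[-1])

# Levels sorted by increasing total
samp.data$fullname <- reorder(samp.data$fullname, row_totals)
after <- levels(samp.data$fullname)

print(before)
print(after)
```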


how to change the stacked bar chart using ggplot2 (percentage, sort) in R

library(tidyverse)

data(tips, package='reshape2')

tips %>%
  # Calculating percentage by sex outside ggplot2
  count(sex, day) %>%
  group_by(sex) %>%
  mutate(p = 100 * n / sum(n)) %>%
  ungroup() %>%
  # Removing Friday
  filter(day != "Fri") %>%
  # Ordering sex by Saturday percentage
  mutate(
    sex = fct_reorder2(
      .f = sex,
      .x = p,
      .y = day,
      .fun = function(x, y) max(x[y == "Sat"])
    )
  ) %>%
  ggplot(aes(x = sex, y = p)) +
  geom_col(aes(fill = day))
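The same ordering idea can be sketched in base R, which may make the fct_reorder2 call easier to follow. The counts below are hypothetical, not the real tips data, and the sketch assumes the default descending order of fct_reorder2: levels are sorted so that the group with the largest Saturday percentage comes first.

```r
# Hypothetical counts per sex and day (not the real tips data)
counts <- data.frame(
  sex = c("Female", "Female", "Male", "Male"),
  day = c("Sat", "Sun", "Sat", "Sun"),
  n   = c(20, 30, 60, 40)
)

# Percentage of each day within each sex
counts$p <- 100 * counts$n / ave(counts$n, counts$sex, FUN = sum)

# Keep the Saturday rows and sort the sexes by their Saturday percentage
sat        <- counts[counts$day == "Sat", ]
sex_levels <- sat$sex[order(sat$p, decreasing = TRUE)]

# Set the factor levels explicitly; a plot would follow this order
counts$sex <- factor(counts$sex, levels = sex_levels)
print(levels(counts$sex))
```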

How to include a bar for total in a stacked bar chart for categorical variable in R

You could achieve your desired result by adding a second geom_bar where you map a constant (e.g. the string "Total") on x and setting the order of the categories via scale_x_discrete:

library(ggplot2)

ggplot(df, aes(x = City, fill = Type)) +
  geom_bar(position = "fill", alpha = 0.75) +
  geom_bar(aes(x = "Total"), position = "fill", alpha = 0.75) +
  scale_fill_brewer(palette = "Paired") +
  scale_x_discrete(limits = c("Total", unique(df$City))) +
  coord_flip()
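Since the extra bar is built from every row of the data, with position = "fill" it shows the overall share of each Type. A base R sketch on made-up data (the City and Type values are hypothetical) gives the numbers each bar should display and the limits vector passed to scale_x_discrete:

```r
# Hypothetical data; only the column names match the question
df <- data.frame(
  City = c("A", "A", "A", "B"),
  Type = c("x", "y", "y", "x")
)

# Share of each Type within each City (one row per city bar)
by_city <- prop.table(table(df$City, df$Type), margin = 1)

# Share of each Type over all rows (the "Total" bar)
overall <- prop.table(table(df$Type))

# Axis order used in scale_x_discrete(limits = ...)
x_limits <- c("Total", unique(df$City))

print(by_city)
print(overall)
print(x_limits)
```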


Order Bars in ggplot2 bar graph

The key with ordering is to set the levels of the factor in the order you want. An ordered factor is not required; the extra information in an ordered factor isn't necessary, and if these data are being used in any statistical model, the wrong parametrisation might result, because polynomial contrasts aren't right for nominal data such as this.

library(ggplot2)
## set the levels in order we want
theTable <- within(theTable,
                   Position <- factor(Position,
                                      levels = names(sort(table(Position),
                                                          decreasing = TRUE))))
## plot (geom_bar() counts rows per level; it takes no binwidth)
ggplot(theTable, aes(x = Position)) + geom_bar()


In the most general sense, we simply need to set the factor levels to be in the desired order. If left unspecified, the levels of a factor will be sorted alphabetically. You can also specify the level order within the call to factor as above, and other ways are possible as well.

theTable$Position <- factor(theTable$Position, levels = c(...))
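Both points can be checked on a small made-up vector of positions (the values are hypothetical): the levels end up sorted by frequency, and converting the result to an ordered factor switches the default model contrasts from treatment to polynomial.

```r
# Hypothetical data; only the column name matches the question
theTable <- data.frame(Position = c("GK", "DF", "DF", "MF", "MF", "MF"))

# Level names sorted by decreasing frequency
freq_levels <- names(sort(table(theTable$Position), decreasing = TRUE))
theTable$Position <- factor(theTable$Position, levels = freq_levels)

# A plain factor uses treatment contrasts (one column per non-baseline level)
plain_contrasts <- colnames(contrasts(theTable$Position))

# An ordered factor uses polynomial contrasts (linear, quadratic, ...)
ordered_contrasts <- colnames(contrasts(as.ordered(theTable$Position)))

print(levels(theTable$Position))
print(plain_contrasts)
print(ordered_contrasts)
```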

GGPLOT2: Stacked bar plot for two discrete variable columns

Your problem here is that you haven't reshaped your tibble from wide to long format.

library(tidyr)
FixedData <- sampleData %>%
  pivot_longer(cols = c("var_1", "var_2"), names_prefix = "var_",
               names_to = "Variable Number", values_to = "ValueName")

Once you do this, the problem becomes much easier to solve. You only need to change a few things, most notably the y, fill, and position variables to make it work.

p2 <- ggplot(FixedData, aes(x = grp, y = ValueName, fill = `Variable Number`)) +
  geom_bar(stat = "identity", position = "stack") +
  coord_flip() + theme_bw()

p2
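To see what the wide-to-long step produces without tidyr, here is a base R sketch that uses stack() instead of pivot_longer(). The data values are made up, and the column name variable_number is a hypothetical stand-in for `Variable Number`. The rows come out in a different order than with pivot_longer(), which does not matter for the plot.

```r
# Hypothetical wide data; column names grp, var_1, var_2 follow the question
sampleData <- data.frame(
  grp   = c("g1", "g2"),
  var_1 = c(3, 5),
  var_2 = c(4, 1)
)

# stack() puts all values in one column and the source column name in `ind`
stacked <- stack(sampleData[c("var_1", "var_2")])

# One row per (grp, variable) pair, with the "var_" prefix stripped
FixedData <- data.frame(
  grp             = rep(sampleData$grp, times = 2),
  variable_number = sub("^var_", "", as.character(stacked$ind)),
  ValueName       = stacked$values
)

print(FixedData)
```

Each grp now has one row per variable, so y and fill can each be mapped to a single column.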

