Order categorical data in a stacked bar plot with ggplot2
I see that you have an order column in your data frame, which I gather is the ordering you want. Hence you can simply do:
p0 = qplot(factor(kclust), fill = reorder(hhDomMil, order), position = 'fill',
data = df1)
Here are the elements of this code that take care of your questions:
- How do I plot such an ordered plot?
  reorder
- How do I set up x so that each bar is "on" one number?
  factor(kclust)
- How do I separate the bars?
- How do I print all kclust numbers in x?
  factor(kclust)
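To see what reorder does to the factor levels before any plotting, here is a minimal base R sketch. The data frame and its columns grp and ord are hypothetical stand-ins for hhDomMil and order, not the asker's actual data.

```r
# Hypothetical data: 'grp' stands in for hhDomMil, 'ord' for the order column
df <- data.frame(
  grp = c("c", "a", "b", "c", "a", "b"),
  ord = c(1, 3, 2, 1, 3, 2)
)

# Without reorder, levels are alphabetical
default_levels <- levels(factor(df$grp))

# reorder sorts the levels by the mean of 'ord' within each group
f <- reorder(df$grp, df$ord)
reordered_levels <- levels(f)

print(default_levels)
print(reordered_levels)
```

ggplot2 stacks and colors the fill groups in level order, so changing the levels this way is what changes the plot.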
I remember from a previous question of yours that the hhDomMil values corresponded to different groups, and I suspect your ordering follows the grouping. In that case, you might want to use that information to choose a color palette that makes the graph easier to follow. Here is one way to do it:
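library(RColorBrewer)
# brewer.pal() needs n >= 3, so take 3 colors and keep the first 2 where only 2 are wanted
mycols = c(brewer.pal(3, 'Oranges'), brewer.pal(3, 'Greens'),
           brewer.pal(3, 'Blues')[1:2], brewer.pal(3, 'PuRd')[1:2])
p0 + scale_fill_manual(values = mycols)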
ggplot2 order categorical stacked bars by proportions of y-axis
Use the group aesthetic to control the order of the stacked bars.
s <- "Sequence Abundance Category
CAGTG 0.8 A
CAGTG 0.2 B
CAGTG 0.6 C
CAGTG 0.3 D
CAGTG 0.1 E
GGGAC 0.1 A
GGGAC 0.1 B
GGGAC 0.3 C
GGGAC 0.6 D
GGGAC 0.1 E
CTTGA 0.1 A
CTTGA 0.7 B
CTTGA 0.1 C
CTTGA 0.1 D
CTTGA 0.8 E"
d <- read.delim(textConnection(s), header = TRUE, sep = " ")
library(ggplot2)
# Add the "group" aesthetic to control the order of the stacked bars
g <- ggplot(d, aes(x = Category, y = Abundance, fill = Sequence, group = Abundance)) +
  geom_bar(position = "fill", stat = "identity")
g
Order the stacked barplot by proportion or percent in R
Add the proportion of c within each group like this when generating p:
... %>%
  group_by(Name) %>%
  mutate(prop_c = sum(value[variable == "c"], na.rm = TRUE) / sum(value, na.rm = TRUE))
Then plot, using reorder:
ggplot() +
  geom_col(data = p, aes(x = reorder(Name, prop_c, decreasing = TRUE), y = value, fill = variable), position = "fill") +
  geom_text(data = long_totals, aes(y = 100, x = Name, label = Total), size = 7, position = position_fill(vjust = 1.02)) +
  scale_y_continuous(labels = scales::percent_format())
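The per-group proportion can be checked on its own before plotting. The sketch below is a base R equivalent of the group_by/mutate step, using ave() instead of dplyr. The data frame is hypothetical (it only reuses the column names Name, variable and value), and it sorts in descending order by negating the score rather than passing decreasing.

```r
# Hypothetical long-format data with the same column names as above
p <- data.frame(
  Name = rep(c("x", "y"), each = 3),
  variable = rep(c("a", "b", "c"), times = 2),
  value = c(1, 1, 2, 1, 1, 3)
)

# Share of variable "c" within each Name, repeated on every row of that Name
c_total <- ave(ifelse(p$variable == "c", p$value, 0), p$Name, FUN = sum)
all_total <- ave(p$value, p$Name, FUN = sum)
p$prop_c <- c_total / all_total

# Largest share of "c" first: negate the score to sort in descending order
name_order <- levels(reorder(p$Name, -p$prop_c))
print(unique(p[, c("Name", "prop_c")]))
print(name_order)
```

Here x has a "c" share of 0.5 and y a share of 0.6, so y comes first on the axis.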
sort columns with categorical variables by numerical variables in stacked barplot
The issue here is that all your percentages for a given category (name) in fact add up to 100%. So sorting by percentage, which is normally achieved via aes(x = reorder(name, percentage), y = percentage), won't work here.
Instead, you probably want to order by the percentage of the data that has class = 1 (or class = -1). Doing this requires some trickery: use ifelse to select the percentage for the rows where class == 1. For all other rows, select the value 0:
ggplot(df, aes(x = reorder(name, ifelse(class == 1, percentage, 0)), y = percentage, fill = factor(class))) +
geom_bar(stat = "identity") +
scale_fill_discrete(name = "Class") +
xlab('Names')
You might want to execute just the reorder instruction to see what's going on:
reorder(df$name, ifelse(df$class == 1, df$percentage, 0))
# [1] A A B B C C D D
# attr(,"scores")
# A B C D
# 44.055 48.720 47.020 46.630
# Levels: A D C B
As you can see, your names got reordered based on the mean percentage for each category (by default, reorder uses the mean; see its manual page for more details). But the "mean" we calculated was between each name's percentage for class = 1, and the value 0 (for class ≠ 1).
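If the halved scores are confusing, one option is to pass a different summary function to reorder. The sketch below uses hypothetical data with the same columns (name, class, percentage) and compares the default mean with FUN = max. It assumes every name has exactly one class == 1 row, in which case both give the same level order.

```r
# Hypothetical data: two rows per name, percentages sum to 100 within a name
df <- data.frame(
  name = rep(c("A", "B", "C"), each = 2),
  class = rep(c(-1, 1), times = 3),
  percentage = c(40, 60, 70, 30, 55, 45)
)

# Keep the class == 1 percentage, zero out everything else
score <- ifelse(df$class == 1, df$percentage, 0)

by_mean <- reorder(df$name, score)             # default: mean of (percentage, 0)
by_max  <- reorder(df$name, score, FUN = max)  # the class == 1 percentage itself

print(attr(by_mean, "scores"))
print(attr(by_max, "scores"))
print(levels(by_max))
```

With FUN = max the stored scores are the actual class = 1 percentages, which makes the output of reorder easier to read.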
Order Stacked Bar Graph by sum / total of all subgroups
The general (non ggplot-specific) answer is to use reorder() to reset the factor levels in a categorical column, based on some function of the other columns.
## Examine the default factor order
levels(samp.data$fullname)
## Reorder fullname based on the sum of the other columns
samp.data$fullname <- reorder(samp.data$fullname, rowSums(samp.data[-1]))
## Examine the new factor order
levels(samp.data$fullname)
attributes(samp.data$fullname)
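Then just replot, using code from the original question (updated for current ggplot2):
library(reshape2)  # for melt()
library(ggplot2)
md <- melt(samp.data, id = "fullname")
temp.plot <- ggplot(data = md, aes(x = fullname, y = value, fill = variable)) +
  geom_col() +  # y is supplied, so use geom_col(), i.e. geom_bar(stat = "identity")
  theme(axis.text.x = element_text(angle = 90)) +  # element_text() replaces the removed theme_text()
  labs(title = "Score Distribution")
## ggsave(temp.plot, filename = "test.png")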
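The reordering step can be tried on a small example first. The sketch below builds a hypothetical samp.data (the names and the score columns q1 and q2 are made up for illustration) and shows the levels before and after reordering by row totals, plus a descending variant.

```r
# Hypothetical wide data: one row per person, one column per score
samp.data <- data.frame(
  fullname = c("Ann", "Bob", "Cy"),
  q1 = c(5, 1, 3),
  q2 = c(4, 2, 1),
  stringsAsFactors = TRUE
)

before <- levels(samp.data$fullname)   # alphabetical default

# Row totals across all columns except fullname
totals <- rowSums(samp.data[-1])

# Ascending by total; negate the totals for descending order
samp.data$fullname <- reorder(samp.data$fullname, totals)
ascending <- levels(samp.data$fullname)
descending <- levels(reorder(samp.data$fullname, -totals))

print(before)
print(ascending)
print(descending)
```

Because the new level order is stored in the column itself, it survives melt() and is picked up by ggplot without further changes.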
how to change the stacked bar chart using ggplot2 (percentage, sort) in R
library(tidyverse)
data(tips, package = 'reshape2')
tips %>%
  # Calculating percentage by sex outside ggplot2
  count(sex, day) %>%
  group_by(sex) %>%
  mutate(p = 100 * n / sum(n)) %>%
  ungroup() %>%
  # Removing Friday
  filter(day != "Fri") %>%
  # Ordering sex by Saturday percentage
  mutate(
    sex = fct_reorder2(
      .f = sex,
      .x = p,
      .y = day,
      .fun = function(x, y) max(x[y == "Sat"])
    )
  ) %>%
  ggplot(aes(x = sex, y = p)) +
  geom_col(aes(fill = day))
How to include a bar for total in a stacked bar chart for categorical variable in R
You could achieve your desired result by adding a second geom_bar where you map a constant (e.g. the string Total) on x and setting the order of categories via scale_x_discrete:
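library(ggplot2)
ggplot(df, aes(x = City, fill = Type)) +
  geom_bar(position = "fill", alpha = 0.75) +
  # Second layer: map the constant "Total" to x to get one bar over all cities
  geom_bar(aes(x = "Total"), position = "fill", alpha = 0.75) +
  scale_fill_brewer(palette = "Paired") +
  # as.character() keeps the city names intact if City is a factor
  scale_x_discrete(limits = c("Total", as.character(unique(df$City)))) +
  coord_flip()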
Order Bars in ggplot2 bar graph
The key with ordering is to set the levels of the factor in the order you want. An ordered factor is not required: the extra information in an ordered factor isn't necessary, and if these data are used in a statistical model, the wrong parametrisation might result, because polynomial contrasts aren't right for nominal data such as this.
## set the levels in the order we want (most frequent first)
theTable <- within(theTable,
                   Position <- factor(Position,
                                      levels = names(sort(table(Position),
                                                          decreasing = TRUE))))
## plot (geom_bar() counts rows itself; binwidth does not apply to it)
ggplot(theTable, aes(x = Position)) + geom_bar()
In the most general sense, we simply need to set the factor levels to be in the desired order. If left unspecified, the levels of a factor will be sorted alphabetically. You can also specify the level order within the call to factor as above, and other ways are possible as well.
theTable$Position <- factor(theTable$Position, levels = c(...))
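Both ways of setting the levels can be compared on a small vector. The sketch below uses a hypothetical position vector (the values are made up) and shows the alphabetical default, levels ordered by frequency, and levels given by hand. None of the results is an ordered factor.

```r
# Hypothetical categorical data with distinct counts per category
position <- c("Mid", "Defense", "Mid", "Striker", "Defense", "Mid")

# Default: levels sorted alphabetically
alphabetical <- levels(factor(position))

# Levels ordered by decreasing frequency
by_freq <- factor(position,
                  levels = names(sort(table(position), decreasing = TRUE)))

# Levels in an arbitrary hand-picked order
manual <- factor(position, levels = c("Striker", "Mid", "Defense"))

print(alphabetical)
print(levels(by_freq))
print(levels(manual))
print(is.ordered(by_freq))
```

Passing any of these factors to aes(x = ...) makes ggplot draw the bars in the corresponding level order.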
GGPLOT2: Stacked bar plot for two discrete variable columns
Your problem here is that you haven't reshaped your tibble from wide to long format.
FixedData <- sampleData %>%
pivot_longer(cols = c("var_1", "var_2"), names_prefix = "var_",
names_to = "Variable Number", values_to = "ValueName")
Once you do this, the problem becomes much easier to solve. You only need to change a few things, most notably the y, fill, and position arguments, to make it work.
p2 <- ggplot(FixedData, aes(x = grp, y = ValueName, fill = `Variable Number`)) +
geom_bar(stat="identity", position = "stack")+
coord_flip()+ theme_bw()
p2