Sorting of categorical variables in ggplot
I'm pretty sure stat_sort does not exist, so it's not surprising that it doesn't work as you think it should. Luckily, there's the reorder() function, which reorders the levels of a categorical variable according to the values of a second variable. I think this should do what you want:
trial.plot <- qplot( x = numbers, y = reorder(letters, numbers), data = trial)
trial.plot
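To see what reorder() does on its own, here is a minimal base-R sketch. The trial data frame below is a hypothetical stand-in, since the asker's data is not shown; only the column names are taken from the call above.

```r
# Hypothetical stand-in for the asker's data (values are made up)
trial <- data.frame(
  letters = c("a", "b", "c", "d"),
  numbers = c(30, 10, 40, 20)
)

# reorder() returns a factor whose levels are sorted by the second argument
sorted_letters <- reorder(trial$letters, trial$numbers)
levels(sorted_letters)
# [1] "b" "d" "a" "c"

# Negating the values sorts the levels in decreasing order instead
reversed_letters <- reorder(trial$letters, -trial$numbers)
levels(reversed_letters)
# [1] "c" "a" "d" "b"
```

ggplot2 lays out a discrete axis in level order, so the first level ends up closest to the origin.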
sort columns with categorical variables by numerical variables in stacked barplot
The issue here is that all your percentages for a given category (name) in fact add up to 100%. So sorting by percentage, which is normally achieved via aes(x = reorder(name, percentage), y = percentage), won’t work here.
Instead, you probably want to order by the percentage of the data that has class = 1 (or class = -1). Doing this requires some trickery: use ifelse to select the percentage for the rows where class == 1. For all other rows, select the value 0:
ggplot(df, aes(x = reorder(name, ifelse(class == 1, percentage, 0)), y = percentage, fill = factor(class))) +
  geom_bar(stat = "identity") +
  scale_fill_discrete(name = "Class") +
  xlab('Names')
You might want to execute just the reorder instruction to see what’s going on:
reorder(df$name, ifelse(df$class == 1, df$percentage, 0))
# [1] A A B B C C D D
# attr(,"scores")
# A B C D
# 44.055 48.720 47.020 46.630
# Levels: A D C B
As you can see, your names got reordered based on the mean percentage for each category (by default, reorder uses the mean; see its manual page for more details). But the “mean” we calculated was between each name’s percentage for class = 1, and the value 0 (for class ≠ 1).
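The averaging with 0 can be checked on a small example. This is only a sketch: the data frame is hypothetical (two rows per name, one per class), and the FUN = sum variant is an alternative I am adding for comparison, not something the original answer used.

```r
# Hypothetical data: each name has one class == 1 row and one class == -1 row
df <- data.frame(
  name = rep(c("A", "B", "C"), each = 2),
  class = rep(c(1, -1), times = 3),
  percentage = c(60, 40, 20, 80, 90, 10)
)

# Keep the percentage for class == 1 rows, use 0 everywhere else
score <- ifelse(df$class == 1, df$percentage, 0)

# Default summary is the mean, so each score is (percentage + 0) / 2
by_mean <- reorder(df$name, score)
attr(by_mean, "scores")   # A = 30, B = 10, C = 45

# Summing instead gives the class == 1 percentage itself (60, 20, 90)
by_sum <- reorder(df$name, score, FUN = sum)

levels(by_mean)
# [1] "B" "A" "C"
```

With exactly two rows per name, halving every score does not change the ranking, so both variants produce the same level order.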
R - ggplot2 re-order of categorical variable (issue with reorder func)
Relevel the variable using the sorted levels of the 'value' variable (I created a new one for comparison purposes):
data$value2 <- factor(data$value, levels = sort(levels(data$value)))
ggplot(data = data, aes(x = value2, y = mean, fill = value2)) +
  geom_bar(stat = "identity") + theme_minimal()
How to reorder categorical variables in x axis in ggplot2?
We could use fct_relevel from the forcats package, which is part of the tidyverse:
library(tidyverse)
S1_sub %>% filter(season == "season1", variable == "rainfall") %>%
  filter(str_detect(source, 'Mada|IDW2')) %>%
  mutate(aggregation_period = fct_relevel(aggregation_period, c("Mar", "Apr", "May", "Jun", "Jul", "AMJ", "MAMJJ"))) %>%
  ggplot(aes(x = aggregation_period, y = value)) +
  geom_boxplot(width = 0.3, col = "black") +
  facet_grid(cols = vars(source), rows = vars(season), scales = "free_y") +
  scale_y_continuous(breaks = seq(0, 1350, 175)) +
  xlab("Rainfall Indices") +
  ylab("Rainfall (mm)") +
  theme_bw() +
  theme(axis.title = element_text(size = 12), # all titles
        axis.text = element_text(colour = "black"),
        axis.text.x = element_text(angle = 90, hjust = 1,
                                   size = 10, color = "black"),
        # axis.text.y = element_text(size = 10),
        panel.border = element_rect(color = "black",
                                    size = .5))
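If forcats is not available, a similar fixed ordering can be sketched in base R with factor(levels = ...). The period vector below is a hypothetical stand-in for the aggregation_period column; only the level names come from the code above.

```r
# Hypothetical values standing in for the aggregation_period column
period <- c("AMJ", "Apr", "Jul", "Jun", "MAMJJ", "Mar", "May")

# The order the axis should follow
wanted <- c("Mar", "Apr", "May", "Jun", "Jul", "AMJ", "MAMJJ")

# Passing levels explicitly fixes the order ggplot2 will use on the axis
period_f <- factor(period, levels = wanted)
levels(period_f)
```

One difference to keep in mind: factor() turns any value missing from levels into NA, whereas fct_relevel moves the listed levels to the front and keeps the remaining ones.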
ggplot reorders categorical variables
sort(xaxis)
[1] "100" "80" "90"
Character vectors are sorted character by character, i.e. sort() doesn't understand the numerical context of the data.
ggplot2 will convert character variables to factors, and by default factors sort their levels:
factor(xaxis)
[1] 80 90 100
Levels: 100 80 90
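One way around this, sketched in base R, is to build the levels in numeric order before handing the variable to ggplot2. The xaxis values are assumed from the output above; the helper names are mine.

```r
# Values assumed from the output shown above
xaxis <- c("80", "90", "100")

# Order the strings by their numeric value, keeping them as strings
numeric_levels <- unique(xaxis[order(as.numeric(xaxis))])

# Supplying the levels explicitly overrides the default alphabetical sort
xaxis_f <- factor(xaxis, levels = numeric_levels)
levels(xaxis_f)
# [1] "80"  "90"  "100"
```

If the values are genuinely numeric, converting the column with as.numeric() and using a continuous axis may be the simpler choice.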
Order categorical data in a stacked bar plot with ggplot2
I see that you have an order column in your data frame, which I gather defines your ordering. Hence you can simply do:
p0 = qplot(factor(kclust), fill = reorder(hhDomMil, order), position = 'fill',
           data = df1)
Here are the elements of this code that take care of your questions:
- How do I plot such an ordered plot? reorder
- How do I set up x so that each bar is "on" one number? factor(kclust)
- How do I separate the bars?
- How do I print all kclust numbers in x? factor(kclust)
I remember from a previous question of yours that the hhDomMil corresponded to different groups, and I suspect your ordering follows the grouping. In that case, you might want to use that information to choose a color palette that makes it simpler to follow the graph. Here is one way to do it.
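library(RColorBrewer)
# brewer.pal() needs n >= 3 (smaller n warns and returns 3 colours), so subset to get 2
mycols = c(brewer.pal(3, 'Oranges'), brewer.pal(3, 'Greens'),
           brewer.pal(3, 'Blues')[1:2], brewer.pal(3, 'PuRd')[1:2])
p0 + scale_fill_manual(values = mycols)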
reorder x-axis variables by sorting a subset of the data
df$name2 <- factor(df$name, levels = xvals)
ggplot(df, aes(x = name2, y = percent, fill = cat)) +
  geom_bar(stat = "identity", position = "fill")
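The snippet above does not show how xvals is built. A minimal sketch of one possibility follows, assuming the goal is to order the names by percent within a single category. The data values and the choice of category "x" are hypothetical, and this is not necessarily how the original answer defined xvals.

```r
# Hypothetical data with the column names used above
df <- data.frame(
  name = rep(c("A", "B", "C"), each = 2),
  cat = rep(c("x", "y"), times = 3),
  percent = c(50, 50, 20, 80, 70, 30)
)

# Take the subset that should drive the ordering (assumed: cat == "x")
sub <- df[df$cat == "x", ]

# Names sorted by percent within that subset (each name appears once here)
xvals <- sub$name[order(sub$percent)]

# Use that order as the factor levels for the full data frame
df$name2 <- factor(df$name, levels = xvals)
levels(df$name2)
# [1] "B" "A" "C"
```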
Data frame variable order for ggplot
First off, you can set the order of factor levels for columns like region in your original dataframe. Then you don't end up with all these different slightly modified versions of the same data. Then sort the dataframe how you want it, and use forcats::fct_inorder to reassign the factor levels for my_name based on their current order in the dataframe:
library(tidyverse)
library(ggplot2)
library(forcats)
set.seed(1)
num_rows <- 12
sample_names <- do.call(paste0, replicate(5, sample(letters, num_rows, TRUE), FALSE))
df1 <- data.frame(region = sample(c("N", "S", "E", "W"), num_rows, replace = TRUE),
                  sub_region = sample(c("High", "Medium", "Low"), num_rows, replace = TRUE),
                  my_order = seq(1, num_rows),
                  my_name = sample_names,
                  var_1 = sample(100, num_rows, replace = TRUE))
df1$region <- factor(df1$region, levels = c("N","E","S","W"))
df1$sub_region <- factor(df1$sub_region, levels = c("High","Medium","Low"))
df1 <- df1[order(df1$region, df1$sub_region, df1$my_order, decreasing = TRUE), ]
# Order my_name levels based on current order
df1$my_name = fct_inorder(df1$my_name)
df1 %>% ggplot() + geom_point(aes( x = var_1, y = my_name, color=sub_region))
Note that I had to use decreasing = TRUE in the order() call to get the order going top to bottom.
For categorical variables like my_name, it's the order of factor levels that determines the order ggplot plots them in, not their current order in the dataframe, which is what you were changing in your example code. This makes the tools in forcats very useful when you need to control the order in a plot.
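The difference between row order and level order can be seen in a small base-R sketch. The data frame is hypothetical, and factor(x, levels = unique(x)) is used here as an assumed base-R stand-in for fct_inorder.

```r
# Hypothetical data; my_name starts out as a factor with alphabetical levels
df <- data.frame(
  my_name = factor(c("pear", "apple", "fig")),
  var_1 = c(1, 3, 2)
)

# Sorting the rows changes their order in the data frame ...
sorted <- df[order(df$var_1), ]
as.character(sorted$my_name)
# [1] "pear"  "fig"   "apple"

# ... but the factor levels, which ggplot uses, are unchanged
levels(sorted$my_name)
# [1] "apple" "fig"   "pear"

# Rebuild the levels in order of first appearance
row_order <- as.character(sorted$my_name)
sorted$my_name <- factor(row_order, levels = unique(row_order))
levels(sorted$my_name)
# [1] "pear"  "fig"   "apple"
```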