Standard error bars using stat_summary
Well, I can't tell you how to get a multiplier by group into stat_summary.
However, it looks like your goal is to plot means and error bars that represent one standard error from the mean in ggplot without summarizing the dataset before plotting.
There is a mean_se function in ggplot2 that we can use instead of mean_cl_normal from Hmisc. The mean_se function has a multiplier of 1 as the default, so we don't need to pass any extra arguments if we want standard error bars.
# fun replaces fun.y, which is deprecated as of ggplot2 3.3.0
ggplot(mtcars, aes(cyl, qsec)) +
  stat_summary(fun = mean, geom = "bar") +
  stat_summary(fun.data = mean_se, geom = "errorbar")
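To see what mean_se passes to the errorbar geom, it can be called directly on a vector. This is only a sketch: it assumes ggplot2 is installed, and the vector x holds arbitrary illustrative values, not data from the question.

```r
library(ggplot2)

# Hypothetical sample values, chosen only for illustration
x <- c(14.5, 16.7, 17.0, 18.6, 20.2)

# mean_se() returns a one-row data frame with columns y, ymin, ymax
res <- mean_se(x)

# The same quantities computed by hand: mean +/- 1 * sd / sqrt(n)
se <- sd(x) / sqrt(length(x))
manual <- c(mean(x), mean(x) - se, mean(x) + se)

print(res)
print(manual)
```

The two printed results should agree, which is why no multiplier needs to be supplied for one-standard-error bars.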
If you want to use the mean_cl_normal function from Hmisc, you have to change the multiplier to 1 so you get one standard error from the mean. The mult argument belongs to mean_cl_normal, not to stat_summary. Any arguments intended for the summary function need to be passed as a list to the fun.args argument:
# fun replaces fun.y, which is deprecated as of ggplot2 3.3.0
ggplot(mtcars, aes(cyl, qsec)) +
  stat_summary(fun = mean, geom = "bar") +
  stat_summary(fun.data = mean_cl_normal, geom = "errorbar", fun.args = list(mult = 1))
In pre-2.0 versions of ggplot2, the argument could be passed directly:
ggplot(mtcars, aes(cyl, qsec)) +
  stat_summary(fun.y = mean, geom = "bar") +
  stat_summary(fun.data = mean_cl_normal, geom = "errorbar", mult = 1)
Issue with using stat_summary to produce error bars for line graphs when faceting
Here's what I think is happening: there are two rows of data per week in the unfaceted plot, but only one row per week in each panel of the faceted plot, causing the standard error calculation to return NA. stat_summary is intended for unsummarized data and does the data summaries internally. Use bug_subset_final with stat_summary, or switch to geom_errorbar to continue using wickhami_sum. Details below.
You've pre-summarized the data, but stat_summary is intended to work on the raw data and calculate the summary values internally. In the summary data frame wickhami_sum that you've passed to ggplot, there are two rows per week, one for 2015 and one for 2016. The summary operation has collapsed all of the data for each week and year down to a single row.
Thus, in the unfaceted plot, there are two rows of data for stat_summary to operate on for each week. But in the faceted plot, it's trying to calculate a standard error from a single observation, which is probably returning NA, hence nothing gets plotted. Even in the unfaceted plot, your error bars are being calculated from just the two yearly means for each week, which isn't what you want either.
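The single-observation case can be checked in base R: the sample standard deviation divides by n - 1, so it is undefined for one value. A small sketch, with arbitrary numbers standing in for the pre-summarized means:

```r
# One pre-summarized value, as in a faceted panel with one row per week
one_value <- 12
# Two values, as in the unfaceted plot with one row per year
two_values <- c(12, 15)

# Standard error of the mean: sd / sqrt(n)
se <- function(x) sd(x) / sqrt(length(x))

se_one <- se(one_value)   # NA, because sd() of a single value is undefined
se_two <- se(two_values)  # finite, but it only describes the spread of two means

print(se_one)
print(se_two)
```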
Instead, either continue to use wickhami_sum, but replace stat_summary with:
geom_errorbar(aes(ymin = wickhami - se, ymax = wickhami + se))
Or, use the raw data (which looks like it's called bug_subset_final) with stat_summary:
ggplot(bug_subset_final, aes(x = week, y = wickhami)) +
  stat_summary(fun.data = mean_se, geom = "errorbar")
R: Show % differences between values: how to calculate error bars?
First of all, you can get your original plot more easily using stat_summary(), because it will calculate the mean and SD for you directly inside the ggplot() call.
But to your question, you can easily calculate the fold change before passing the data to ggplot() by doing a mutate() where you set the mean of vol[reg == "control"] as the denominator. Then you can format the y axis using {scales}.
library(tidyverse)
library(scales)
dd <- data.frame(id = rep(c(1, 2, 3), 2),
                 vol = c(10, 5, 8, 11, 10, 9),
                 reg = rep(c('control', 'new'), each = 3))
# original plot using stat_summary to avoid transforming data
dd %>%
  ggplot(aes(reg, vol)) +
  stat_summary(geom = "bar", fun = mean) +
  stat_summary(geom = "errorbar", fun.data = mean_cl_normal, fun.args = list(mult = 1))
# calculate % of control
dd %>%
  mutate(norm_vol = vol / mean(vol[reg == "control"])) %>%
  ggplot(aes(reg, norm_vol)) +
  stat_summary(geom = "bar", fun = mean) +
  stat_summary(geom = "errorbar", fun.data = mean_cl_normal, fun.args = list(mult = 1)) +
  scale_y_continuous(labels = scales::percent_format())
Created on 2022-02-21 by the reprex package (v2.0.1)
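The normalization step can also be checked without plotting. The sketch below repeats the mutate() logic in base R on the same dd data frame; the names control_mean and group_means are introduced here only for illustration.

```r
dd <- data.frame(id = rep(c(1, 2, 3), 2),
                 vol = c(10, 5, 8, 11, 10, 9),
                 reg = rep(c("control", "new"), each = 3))

# Divide every volume by the mean of the control group
control_mean <- mean(dd$vol[dd$reg == "control"])
dd$norm_vol <- dd$vol / control_mean

# Group means on the normalized scale: control is 1 (100%) by construction
group_means <- tapply(dd$norm_vol, dd$reg, mean)
print(group_means)
```

Because every observation is rescaled by the same constant, the error bars computed by stat_summary on norm_vol are simply the original ones divided by the control mean.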
How to calculate standard error instead of standard deviation in ggplot
A couple of things. First, you need to reassign e when you add geom_violin and stat_summary. Otherwise, those changes aren't carried forward when you add the boxplot in the next step. Second, when you add the boxplot last, it is drawn over the points and error bars from stat_summary, so it looks like they're disappearing. If you add the boxplot first and then stat_summary, the points and error bars will be placed on top of the boxplot. Here is an example:
library(ggplot2)
library(ggpubr)
library(Hmisc)
data("ToothGrowth")
ToothGrowth$dose <- as.factor(ToothGrowth$dose)
theme_set(
  theme_classic() +
    theme(legend.position = "top")
)
# Initiate a ggplot
e <- ggplot(ToothGrowth, aes(x = dose, y = len))
# Add violin plot
e <- e + geom_violin(trim = FALSE)
# Combine with box plot to add median and quartiles
# Change fill color by groups, remove legend
e <- e + geom_violin(aes(fill = dose), trim = FALSE) +
  geom_boxplot(width = 0.2) +
  scale_fill_manual(values = c("#00AFBB", "#E7B800", "#FC4E07")) +
  theme(legend.position = "none")
# Add mean points +/- SE
# Use geom = "pointrange" or geom = "crossbar"
e +
  stat_summary(
    fun.data = "mean_se", fun.args = list(mult = 1),
    geom = "pointrange", color = "black"
  )
You said in a comment that you couldn't see any changes when you tried mean_se and mean_cl_normal. Perhaps the above solution will have solved the problem, but you should see a difference. Here is an example just comparing mean_se and mean_sdl. You should notice the error bars are smaller with mean_se.
ggplot(ToothGrowth, aes(x = dose, y = len)) +
  stat_summary(
    fun.data = "mean_sdl", fun.args = list(mult = 1),
    geom = "pointrange", color = "black"
  )
ggplot(ToothGrowth, aes(x = dose, y = len)) +
  stat_summary(
    fun.data = "mean_se", fun.args = list(mult = 1),
    geom = "pointrange", color = "black"
  )
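The size difference between the two plots follows from the standard error being the standard deviation divided by the square root of the group size. A quick base R check on ToothGrowth (the object names here are chosen just for this sketch):

```r
data("ToothGrowth")

# Per-dose standard deviation and group size
sd_by_dose <- tapply(ToothGrowth$len, ToothGrowth$dose, sd)
n_by_dose <- tapply(ToothGrowth$len, ToothGrowth$dose, length)

# Standard error of the mean: sd / sqrt(n)
se_by_dose <- sd_by_dose / sqrt(n_by_dose)

# Compare the two spreads side by side for each dose
print(rbind(sd = sd_by_dose, se = se_by_dose))
```

With 20 observations per dose, each standard error is the corresponding standard deviation divided by sqrt(20), so the mean_se ranges are less than a quarter the length of the mean_sdl ranges.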
Here is a simplified solution if you don't want to reassign at each step:
ggplot(ToothGrowth, aes(x = dose, y = len)) +
  geom_violin(aes(fill = dose), trim = FALSE) +
  geom_boxplot(width = 0.2) +
  stat_summary(fun.data = "mean_se", fun.args = list(mult = 1),
               geom = "pointrange", color = "black") +
  scale_fill_manual(values = c("#00AFBB", "#E7B800", "#FC4E07")) +
  theme(legend.position = "none")
Error bars look huge in R, but not in Excel
As the other answer points out, you should be looking at the standard error (sd/sqrt(n)) rather than the standard deviation. Here is a slightly more compact way to run your code, using stat_summary() to compute the summary statistics (mean_cl_normal normally plots the Normal 95% CIs; mult = 1 tells it to plot ±1 SE instead). If you want the end-caps on your error bars to be narrower, use the width= argument to adjust them.
(My plot still has large error bars but I assume that's because of the size of your reproducible example.)
library(tidyverse)
filter(dat, Condition != "z" & Environment != "a") %>%
  # across() takes the column selection first, then the function to apply
  mutate(across(Gate, fct_inorder)) %>%
  ggplot(aes(Gate, Correct, colour = Sound)) +
  stat_summary(geom = "line", fun = mean) +
  stat_summary(geom = "errorbar", fun.data = \(x) mean_cl_normal(x, mult = 1)) +
  facet_wrap(~ Block)
Using geom_pointrange() to plot means and standard errors
It would be easier to check this if you could provide the actual data frame descriptive_blp_data. Running your code with an arbitrary dataset works as intended and produces error bars, so there is nothing really wrong with the ggplot part.
There may be a few reasons why this does not work with your actual dataset. Maybe the standard errors are too small to show up with a point size of 5?
descriptive_blp_data <- data.frame(
  "group" = c("Group_3", "Group_2", "Group_1"),
  "mean_blp" = c(150, 50, -50),
  "se_blp" = c(40, 20, 30)
)
library(ggplot2)
ggplot(descriptive_blp_data) +
  aes(x = group, y = mean_blp, colour = group, size = 5) +
  geom_pointrange(aes(ymin = mean_blp - se_blp, ymax = mean_blp + se_blp), width = .2,
                  position = position_dodge(.9)) +
  scale_color_manual(
    values = list(
      Group_2 = "#9EBCDA",
      Group_3 = "#8856A7",
      Group_1 = "#E0ECF4"
    )
  ) +
  labs(y = "Mean BLP score (SE)") +
  coord_flip() +
  theme_classic() +
  theme(legend.position = "none", axis.title.y = element_blank()) +
  ylim(-218, 218)