Add Error Bars to Show Standard Deviation on a Plot in R

Add error bars to show standard deviation on a plot in R

A problem with csgillespie's solution appears when you have a logarithmic x-axis: the small caps on the left and right sides of each bar will then have different lengths, because epsilon follows the x-values.

It is better to use the errbar function from the Hmisc package:
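
To see why a cap drawn as x ± epsilon looks lopsided on a log axis, here is a minimal sketch that measures both halves of the cap in log10 units. The x positions and the value of eps are made up for illustration and are not taken from the original question.

```r
# Hypothetical x positions and a fixed cap half-width, both in data units
x   <- c(1, 10, 100)
eps <- 0.5

# Extent of each half of the cap as it appears on a log10 axis
left  <- log10(x) - log10(x - eps)
right <- log10(x + eps) - log10(x)

# The left half is always longer, and both halves shrink as x grows
round(data.frame(x, left, right), 4)
```

Because the two halves differ and also change with x, a cap width given in data units cannot look uniform on a log-scaled axis.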

d = data.frame(
x = c(1:5)
, y = c(1.1, 1.5, 2.9, 3.8, 5.2)
, sd = c(0.2, 0.3, 0.2, 0.0, 0.4)
)

## install.packages("Hmisc", dependencies = TRUE)
library("Hmisc")

# add error bars to an existing plot (does not adjust the y range)
plot(d$x, d$y, type="n")
with (
data = d
, expr = errbar(x, y, y+sd, y-sd, add=TRUE, pch=1, cap=.1)
)

# new plot (adjusts the y range automatically), with a logarithmic x-axis
with (
data = d
, expr = errbar(x, y, y+sd, y-sd, add=FALSE, pch=1, cap=.015, log="x")
)

Add error bars to multiple lines to show standard deviation on a plot in R

arrows is a vectorized function, so the mapply call can be avoided. Consider the following (I have also replaced your first mapply call with matplot):

## generate example data
set.seed(0)
mat <- matrix(runif(25), 5, 5) ## data to plot
stdev <- matrix(runif(25,0,0.1), 5, 5) ## arbitrary standard error
low <- mat - stdev ## lower bound
up <- mat + stdev ## upper bound

x <- seq(0,1,1/4) ## x-locations to plot against
## your colour setting; should have `ncol(mat)` colours
## as an example I just use `cols = 1:ncol(mat)`
cols <- 1:ncol(mat)
## plot each column of `mat` one by one (set y-axis limit appropriately)
matplot(x, mat, col = cols, pch = 1:5, type = "o", ylim = c(min(low), max(up)))
xx <- rep.int(x, ncol(mat)) ## recycle `x` for each column of `mat`
repcols <- rep(cols, each = nrow(mat)) ## recycle `col` for each row of `mat`
## add error bars using the vectorization of `arrows`
arrows(xx, low, xx, up, col = repcols, angle = 90, length = 0.03, code = 3)
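
The single arrows call works because a matrix passed as a coordinate is read as a plain vector in column-major order, which is the same order in which xx repeats x. A small sketch to check that pairing, using a stand-in matrix rather than the random data above:

```r
x   <- seq(0, 1, 1/4)
low <- matrix(1:25, 5, 5)        # stand-in for the lower bounds

xx   <- rep.int(x, ncol(low))    # x repeated once per column
flat <- as.vector(low)           # column-major flattening of the matrix

# Element 7 of the flattened matrix is row 2 of column 2,
# so it has to line up with x[2]
flat[7] == low[2, 2]
xx[7] == x[2]
```

If the matrix were stored with series in rows instead of columns, it would need to be transposed first for the same pairing to hold.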

How can I add already calculated standard error values to each bar in a bar plot (ggplot)?

I think you need to reshape your dataframe in order to make your data simpler to use in ggplot2.

When reshaping data into a longer format with multiple column names as output, I prefer the melt function from the data.table package, but you can get a similar result with the pivot_longer function from tidyr.

At the end, your dataset should look like this:

library(data.table)
DF <- as.data.frame(t(DF))
DF$Gene <- rownames(DF)

DF.m <- melt(setDT(DF), measure = list(grep("Control_",colnames(DF)),grep("Std.error",colnames(DF))),
value.name = c("Control","SD"))

     Gene variable      Control          SD
 1: Gene1        1 -0.017207751 0.007440363
 2: Gene2        1  0.025987401 0.010239336
 3: Gene3        1  0.018122943 0.008892864
 4: Gene4        1 -0.022694115 0.007286011
 5: Gene5        1  0.031315514 0.008674407
 6: Gene6        1 -0.016374358 0.007140279
 7: Gene1        2 -0.009390680 0.004574254
 8: Gene2        2  0.025625772 0.006950560
 9: Gene3        2  0.012997113 0.006541982
10: Gene4        2 -0.009823328 0.004776522
11: Gene5        2  0.013967722 0.006746620
12: Gene6        2 -0.009660298 0.004536602
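
For readers who prefer tidyr, the pivot_longer route mentioned above could look roughly like this. The wide data frame and its column names (Control_Crohns, Std.error_Crohns and so on) are hypothetical, since the original column names are not shown in full; only the "Control_" and "Std.error" prefixes come from the answer.

```r
library(tidyr)

# Hypothetical wide data: one row per gene, one value and one error column per disease
wide <- data.frame(
  Gene             = c("Gene1", "Gene2"),
  Control_Crohns   = c(-0.017, 0.026),
  Std.error_Crohns = c(0.007, 0.010),
  Control_UC       = c(-0.009, 0.026),
  Std.error_UC     = c(0.005, 0.007)
)

# ".value" keeps the first name part as output columns (Control, Std.error);
# the second name part becomes the Disease column
long <- pivot_longer(
  wide,
  cols          = -Gene,
  names_to      = c(".value", "Disease"),
  names_pattern = "^(Control|Std\\.error)_(.*)$"
)
long
```

The result has one row per gene and disease, with the value and its error side by side, which is the shape geom_errorbar needs.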

Then you can easily plot with ggplot2, using geom_errorbar to draw the error bar of each gene (the SD column here holds the precomputed standard errors).

library(ggplot2)

ggplot(DF.m, aes(x = Gene, y= Control, fill = as.factor(variable)))+
geom_col(position = position_dodge())+
geom_errorbar(aes(ymin = Control-SD,ymax = Control+SD), position = position_dodge(0.9), width = 0.2)+
scale_fill_discrete(name = "Disease", labels = c("Crohns", "UC"))

Does it answer your question?

Set error bars to standard deviation on a ggplot2 bar graph

mean_sdl takes an argument mult which specifies the number of standard deviations - by default it is mult = 2. So you need to pass mult = 1:

plt <- ggplot(diamonds, aes(cut, price, fill = color)) +
geom_bar(stat = "summary", fun = "mean",  # `fun` replaces `fun.y`, deprecated since ggplot2 3.3.0
position = position_dodge(width = 0.9)) +
geom_errorbar(stat = "summary", fun.data = "mean_sdl",
fun.args = list(mult = 1),
position = position_dodge(width = 0.9)) +
ylab("mean price") +
ggtitle("Two-Factor Dynamite plot")

plt
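
To make the effect of mult concrete, here is a hand-rolled sketch of the kind of summary that is being requested: the mean plus and minus mult standard deviations. The helper mean_sd_range is hypothetical and only illustrates the arithmetic; it is not the code of mean_sdl itself.

```r
# Hypothetical helper: mean and mean +/- mult * sd, named like ggplot2's y/ymin/ymax
mean_sd_range <- function(v, mult = 2) {
  m <- mean(v)
  s <- sd(v)
  c(y = m, ymin = m - mult * s, ymax = m + mult * s)
}

v <- c(2, 4, 4, 4, 5, 5, 7, 9)    # made-up values

wide_range   <- mean_sd_range(v)            # default: two standard deviations
narrow_range <- mean_sd_range(v, mult = 1)  # one standard deviation

rbind(wide_range, narrow_range)
```

With mult = 1 the bar is exactly half as long as with the default of 2, which is why the unmodified plot shows bars twice as wide as expected.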

geom_errorbar() cannot read standard deviation as numerical value and would not add error bars

Note that you can get geom_errorbar to work by writing geom_errorbar(aes(ymin = Measurement_1 - sd(Measurement_1), ymax = Measurement_1 + sd(Measurement_1))), but then every group gets the same bar: the standard deviation is computed over the whole column, not group-wise.

I recommend using this instead. It will only show error bars for "Dante" in group "A", because your sample data has only one value in each of the other groups, so no spread can be estimated for them (sd() of a single value is NA).

ggplot(blue, aes(x = reorder_M1, y = Measurement_1, fill = Group)) +
stat_summary(fun = mean, geom = "bar", position = "dodge") +
stat_summary(fun.data = "mean_se", geom = "errorbar", position = position_dodge(width = 0.90), width = 0.3)
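
The difference between one global standard deviation and a group-wise one can be checked without plotting. This is a minimal sketch on a made-up stand-in for the blue data frame; only the column names Group and Measurement_1 are taken from the code above.

```r
# Hypothetical stand-in data: three values in group A, a single value in group B
blue <- data.frame(
  Group         = c("A", "A", "A", "B"),
  Measurement_1 = c(1, 2, 3, 4)
)

# One number for the whole column: this is what sd() inside aes() gives
global_sd <- sd(blue$Measurement_1)

# One number per group: this is what a group-wise summary gives
grp_sd <- tapply(blue$Measurement_1, blue$Group, sd)

global_sd
grp_sd
```

Group B has a single observation, so its group-wise standard deviation is NA and no error bar can be drawn for it.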

How to plot standard error bars from a dataframe?

I think the trick here is you need to have a single SE column.

Dataset<- c("MOD", "IP", "MP","CC")
GPP <- c(0.6922179, 0.848324, 0.8363999,0.8783096)
NPP<-c(0.4010816,0.4290893, 0.4197423,0.4368065)
df <- data.frame(Dataset,GPP,NPP)
df.m<-reshape2::melt(df)

SEGPP<-c(0.25, 0.15,0.16,0.16)
SENPP<-c(0.15, 0.06,0.08,0.07)
df.m$SE <- c(SEGPP, SENPP)

And then to make the plot you can use geom_errorbar, where ymin and ymax are defined as the value minus and plus the SE. Using position_dodge(0.9) to align the SE lines with the bars is discussed in this answer.

ggplot(df.m, aes(Dataset, value, fill = variable)) +
geom_bar(stat = 'identity', position = position_dodge()) +
geom_errorbar(aes(ymin = value - SE, ymax = value + SE), position = position_dodge(0.9), width = 0.25)
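
Assigning c(SEGPP, SENPP) relies on the melted rows coming out as all GPP rows followed by all NPP rows. A more defensive sketch, assuming you are willing to write the SE values as a small keyed table, is to join them by Dataset and variable instead. The long table below is built by hand to mirror the melted data; the reversed row order of se is deliberate.

```r
Dataset <- c("MOD", "IP", "MP", "CC")

# Long-format values, mirroring the melted data frame
long <- data.frame(
  Dataset  = rep(Dataset, times = 2),
  variable = rep(c("GPP", "NPP"), each = 4),
  value    = c(0.6922179, 0.848324, 0.8363999, 0.8783096,
               0.4010816, 0.4290893, 0.4197423, 0.4368065)
)

# Standard errors keyed by the same two columns
se <- data.frame(
  Dataset  = rep(Dataset, times = 2),
  variable = rep(c("GPP", "NPP"), each = 4),
  SE       = c(0.25, 0.15, 0.16, 0.16, 0.15, 0.06, 0.08, 0.07)
)
se <- se[nrow(se):1, ]   # scramble the row order on purpose

# The join matches on the keys, so row order no longer matters
long <- merge(long, se, by = c("Dataset", "variable"))
long
```

The merged table can be passed to the same ggplot call in place of df.m.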

How to calculate standard error instead of standard deviation in ggplot

A couple of things. First, you need to reassign e when you add geom_violin and stat_summary. Otherwise, those changes are not carried forward when you add the boxplot in the next step. Second, when you add the boxplot last, it is drawn over the points and error bars from stat_summary, so it looks like they're disappearing. If you add the boxplot first and then stat_summary, the points and error bars will be placed on top of the boxplot. Here is an example:

library(ggplot2)
library(ggpubr)
library(Hmisc)

data("ToothGrowth")
ToothGrowth$dose <- as.factor(ToothGrowth$dose)

theme_set(
theme_classic() +
theme(legend.position = "top")
)

# Initiate a ggplot
e <- ggplot(ToothGrowth, aes(x = dose, y = len))

# Add violin plot
e <- e + geom_violin(trim = FALSE)

# Combine with box plot to add median and quartiles
# Change fill color by groups, remove legend
e <- e + geom_violin(aes(fill = dose), trim = FALSE) +
geom_boxplot(width = 0.2)+
scale_fill_manual(values = c("#00AFBB", "#E7B800", "#FC4E07"))+
theme(legend.position = "none")

# Add mean points +/- SE
# Use geom = "pointrange" or geom = "crossbar"
e +
stat_summary(
fun.data = "mean_se", fun.args = list(mult = 1),
geom = "pointrange", color = "black"
)

You said in a comment that you couldn't see any changes when you tried mean_se and mean_cl_normal. Perhaps the above solution will have solved the problem, but you should see a difference. Here is an example just comparing mean_se and mean_sdl. You should notice the error bars are smaller with mean_se.

ggplot(ToothGrowth, aes(x = dose, y = len)) +
stat_summary(
fun.data = "mean_sdl", fun.args = list(mult = 1),
geom = "pointrange", color = "black"
)
ggplot(ToothGrowth, aes(x = dose, y = len)) +
stat_summary(
fun.data = "mean_se", fun.args = list(mult = 1),
geom = "pointrange", color = "black"
)
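
The size difference between the two plots can also be checked numerically. This sketch computes the standard deviation and the standard error (sd divided by the square root of the group size) of len for each dose, using only base R:

```r
data("ToothGrowth")

# Tooth lengths split by dose level
len_by_dose <- split(ToothGrowth$len, ToothGrowth$dose)

sds <- sapply(len_by_dose, sd)       # standard deviation per dose
n   <- sapply(len_by_dose, length)   # observations per dose
ses <- sds / sqrt(n)                 # standard error per dose

round(rbind(sd = sds, se = ses), 2)
```

With more than one observation per group the standard error is always smaller than the standard deviation, so the mean_se bars must be the shorter ones.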

Here is a simplified solution if you don't want to reassign at each step:

ggplot(ToothGrowth, aes(x = dose, y = len)) + 
geom_violin(aes(fill = dose), trim = FALSE) +
geom_boxplot(width = 0.2) +
stat_summary(fun.data = "mean_se", fun.args = list(mult = 1),
geom = "pointrange", color = "black") +
scale_fill_manual(values = c("#00AFBB", "#E7B800", "#FC4E07")) +
theme(legend.position = "none")

Scatter plot with error bars

First of all: it is very unfortunate and surprising that R cannot draw error bars "out of the box".

Here is my favourite workaround, the advantage is that you do not need any extra packages. The trick is to draw arrows (!) but with little horizontal bars instead of arrowheads (!!!). This not-so-straightforward idea comes from the R Wiki Tips and is reproduced here as a worked-out example.

Let's assume you have a vector of "average values" avg and another vector of "standard deviations" sdev, they are of the same length n. Let's make the abscissa just the number of these "measurements", so x <- 1:n. Using these, here come the plotting commands:

plot(x, avg,
ylim=range(c(avg-sdev, avg+sdev)),
pch=19, xlab="Measurements", ylab="Mean +/- SD",
main="Scatter plot with std.dev error bars"
)
# hack: we draw arrows but with very special "arrowheads"
arrows(x, avg-sdev, x, avg+sdev, length=0.05, angle=90, code=3)

The result looks like this:

[Image: example scatter plot with std.dev error bars]

In the arrows(...) function length=0.05 is the size of the "arrowhead" in inches, angle=90 specifies that the "arrowhead" is perpendicular to the shaft of the arrow, and the particularly intuitive code=3 parameter specifies that we want to draw an arrowhead on both ends of the arrow.

For horizontal error bars the following changes are necessary, assuming that the sdev vector now contains the errors in the x values and the y values are the ordinates:

plot(x, y,
xlim=range(c(x-sdev, x+sdev)),
pch=19,...)
# horizontal error bars
arrows(x-sdev, y, x+sdev, y, length=0.05, angle=90, code=3)
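
Putting the two variants together, here is a self-contained sketch that draws both horizontal and vertical error bars on one plot. All data values and the names sd_x and sd_y are made up for illustration. One standard deviation is set to zero on purpose, because arrows() warns about zero-length arrows and skips them, so those points are filtered out first.

```r
# Made-up data for illustration only
x    <- 1:6
y    <- c(2.1, 2.9, 4.2, 4.8, 6.1, 7.0)
sd_x <- rep(0.15, length(x))
sd_y <- c(0.3, 0.2, 0.4, 0.0, 0.3, 0.5)   # one zero on purpose

# Axis limits wide enough to hold every bar
xlim <- range(c(x - sd_x, x + sd_x))
ylim <- range(c(y - sd_y, y + sd_y))

plot(x, y, xlim = xlim, ylim = ylim, pch = 19,
     xlab = "x", ylab = "y", main = "Error bars in both directions")

# Horizontal error bars
arrows(x - sd_x, y, x + sd_x, y, length = 0.05, angle = 90, code = 3)

# Vertical error bars, skipping points whose bar would have zero length
keep <- sd_y > 0
arrows(x[keep], y[keep] - sd_y[keep], x[keep], y[keep] + sd_y[keep],
       length = 0.05, angle = 90, code = 3)
```

The point with zero spread is still drawn by plot(); it simply has no vertical bar.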

