How to Change the Formatting of Numbers on an Axis with Ggplot

How do I change the formatting of numbers on an axis with ggplot?

I also found another way of doing this that gives proper '×10⁵'-style notation (a multiplication sign and a superscript exponent) on the axes. I'm posting it here in the hope it might be useful to some. I got the code from here, so I claim no credit for it; that rightly goes to Brian Diggs.

fancy_scientific <- function(l) {
  # turn into a character string in scientific notation
  l <- format(l, scientific = TRUE)
  # quote the part before the exponent to keep all the digits
  l <- gsub("^(.*)e", "'\\1'e", l)
  # turn the 'e' into plotmath format
  l <- gsub("e", "%*%10^", l)
  # return this as an expression
  parse(text = l)
}

Which you can then use as

ggplot(data = df, aes(x = x, y = y)) +
  geom_point() +
  scale_y_continuous(labels = fancy_scientific)
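To see what the function builds, it can help to look at the intermediate strings for a couple of values. The following is only a sketch: the helper name `show_steps` and the input values are made up for illustration and are not part of the original answer.

```r
# Hypothetical helper (not from the original answer) that repeats the steps
# of fancy_scientific() but keeps every intermediate string for inspection.
show_steps <- function(l) {
  s1 <- format(l, scientific = TRUE)   # numbers -> "<mantissa>e<exponent>" text
  s2 <- gsub("^(.*)e", "'\\1'e", s1)   # quote the mantissa so its digits survive parsing
  s3 <- gsub("e", "%*%10^", s2)        # swap "e" for the plotmath "times ten to the power"
  list(formatted = s1, quoted = s2, plotmath = s3, parsed = parse(text = s3))
}

steps <- show_steps(c(100000, 250000))
print(steps$formatted)
print(steps$plotmath)
print(class(steps$parsed))  # ggplot2 receives the labels as an expression vector
```

Because the result is an expression vector rather than plain text, ggplot2 renders it with plotmath, which is what produces the superscript exponent.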

Formatting numbers on axis when using ggplot2

Exploiting the fact that you can give a (lambda) function as the labels argument, you can just reconvert the character label to numeric before passing it on to scales::number.

library(ggplot2)
library(scales)
library(dplyr)

RN <- sample(1:1000,1000,replace=TRUE)
RN <- RN/1000
breaks <- c(seq(from=0, to=1, by=0.05))
DF <- data.frame(RN)
DF$DisRN <- cut(DF$RN,breaks=c(breaks,Inf),labels=as.numeric(breaks))
DF_Plot <- DF %>% group_by(DisRN) %>% summarise(cnt=n())


ggplot(DF_Plot, aes(y = cnt, x = DisRN)) +
  geom_col(position = "dodge") +
  scale_x_discrete(
    labels = ~ number(as.numeric(.x), accuracy = 0.01)
  )


Created on 2022-01-11 by the reprex package (v2.0.1)

You can leave out some breaks by setting the breaks argument of the scale to, for example, breaks = seq(0, 0.9, by = 0.1).
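The labelling step on its own can be tried outside a plot. This is a small sketch, assuming the discrete scale passes the factor levels to the labelling function as character strings; the function name `two_decimals` and the sample levels are invented for the example.

```r
library(scales)

# Sample levels as character strings, similar to what cut() produced above
lvls <- c("0", "0.05", "0.1", "0.95")

# Same steps as the ~ number(as.numeric(.x), accuracy = 0.01) lambda,
# written as an ordinary function so it can be called directly
two_decimals <- function(x) number(as.numeric(x), accuracy = 0.01)

two_decimals(lvls)
```

Every label comes back with exactly two decimal places, so "0.1" and "0.05" line up on the axis.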

Number formatting axis labels in ggplot2?

One needs to load library(scales) before attempting this.
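The answer above does not show which formatter it had in mind, so the following is just one plausible illustration: `comma()` from scales used as the `labels` argument. The data frame `sales` and its values are made up.

```r
library(ggplot2)
library(scales)

# Made-up data with values large enough to need thousands separators
sales <- data.frame(
  year    = 2015:2019,
  revenue = c(1200000, 1850000, 2400000, 3100000, 4750000)
)

p_comma <- ggplot(sales, aes(x = year, y = revenue)) +
  geom_line() +
  scale_y_continuous(labels = comma)  # comma() is only found once scales is loaded

# The formatter can also be called directly to preview its output
comma(c(1000, 1000000))
```

Writing `scales::comma` instead avoids the need for `library(scales)` altogether.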

How can I replace numbers on an axis with symbols in ggplot?

You can use a nested gsub() in the labels argument. Here, \u2212 is the unicode minus symbol, but you can use a hyphen too. The regex bit ^ indicates 'starts with', whereas $ indicates 'ends with', so ^0$ should match exact 0s.

library(tidyverse)

exampledf <- structure(
  list(male_diff = c(0.4668, -0.03299, 0.702, -0.11544, 0.689, 0.511, -0.0725,
                     -0.12844, -0.0827, 0.6515, -0.01077, 0.006, 0.0041,
                     -0.00856, 0.4181, -0.02765),
       female_diff = c(-0.459, 0.022, -0.155, 0.00800000000000001, -0.156,
                       -0.326, -0.0224, 0.00700000000000001, 0.0399999999999999,
                       0.2182, -0.08458, 0.8844, -0.8459, 0.122, -0.506, 0.03),
       expt_in_ms = c(1, 1, 2, 2, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4),
       nloci = c(0, 20, 0, 20, 0, 0, 0, 20, 20, 20, 0, 20, 0, 20, 0, 20),
       index = 1:16),
  row.names = c(NA, -16L), class = c("tbl_df", "tbl", "data.frame")
)

ggplot(exampledf, aes(x = male_diff, y = index, color = as.factor(expt_in_ms))) +
  geom_point(aes(alpha = 0.5, size = 3)) +
  geom_point(aes(x = female_diff), alpha = 0.8, shape = 24, size = 5) +
  scale_y_continuous(
    breaks = 1:nrow(exampledf),
    labels = gsub("^0$", "+", gsub("^20$", "\u2212", exampledf$nloci))
  ) +
  xlab("difference in proportion") +
  ylab("correlation") +
  geom_vline(
    xintercept = 0,
    color = 'black',
    linetype = 'dashed',
    alpha = .5
  ) +
  theme_minimal() +
  theme(text = element_text(size = 12)) +
  guides(alpha = "none", size = "none", shape = "none", color = "none")


Created on 2021-08-31 by the reprex package (v1.0.0)

Formatting numbers on a continuous axis in ggplot

french <- function(x) format(x, big.mark = " ")

# p is an existing ggplot object
p + scale_y_continuous(labels = french)
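A self-contained variant might look like the sketch below. Two things here are additions, not part of the original answer: `scientific = FALSE`, because `format()` can otherwise fall back to exponent notation for large values, and `trim = TRUE`, because `format()` pads its results to a common width. The names `french_trim` and `df_fr` and the data are invented.

```r
library(ggplot2)

# Variant of french() with two extra format() arguments:
#   scientific = FALSE keeps large values out of exponent notation
#   trim = TRUE drops the padding format() adds to equalise widths
french_trim <- function(x) {
  format(x, big.mark = " ", scientific = FALSE, trim = TRUE)
}

# Made-up data, only to have a plot to attach the scale to
df_fr <- data.frame(x = 1:4, y = c(1000, 250000, 500000, 1000000))

p_fr <- ggplot(df_fr, aes(x, y)) +
  geom_point() +
  scale_y_continuous(labels = french_trim)

french_trim(c(1000, 1000000))
```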

Formatting ggplot2 axis labels with commas (and K? MM?) if I already have a y-scale

For the comma formatting, you need to load the scales package to use labels = comma. The "error" you discussed is actually just a warning, raised because you used both ylim and scale_y_continuous: the second call overrides the first. You can instead set the limits and specify comma-separated labels in a single call to scale_y_continuous:

library(scales)

ggplot(df, aes(x = Date, y = Cost)) +
  geom_line(lwd = 0.5) +
  geom_line(aes(y = Cost_7), col = 'red', linetype = 3, lwd = 1) +
  geom_line(aes(y = Cost_30), col = 'blue', linetype = 5, lwd = 0.75) +
  xlim(c(left, right)) +
  xlab("") +
  scale_y_continuous(labels = comma,
                     limits = c(min(df$Cost[df$Date > left]),
                                max(df$Cost[df$Date > left])))

Another option would be to melt your data to long format before plotting, which reduces the amount of code needed and streamlines aesthetic mappings:

library(reshape2)

ggplot(melt(df, id.var = "Date"),
       aes(x = Date, y = value, color = variable, linetype = variable)) +
  geom_line() +
  xlim(c(left, right)) +
  labs(x = "", y = "Cost") +
  scale_y_continuous(labels = comma,
                     limits = c(min(df$Cost[df$Date > left]),
                                max(df$Cost[df$Date > left])))

Either way, to put the y values in terms of thousands or millions you could divide the y values by 1,000 or 1,000,000. I've used dollar_format() below, but I think you'll also need to divide by the appropriate power of ten if you use unit_format (per @joran's suggestion). For example:

div <- 1000

ggplot(melt(df, id.var = "Date"),
       aes(x = Date, y = value/div, color = variable, linetype = variable)) +
  geom_line() +
  xlim(c(left, right)) +
  labs(x = "", y = "Cost (Thousands)") +
  scale_y_continuous(labels = dollar_format(),
                     limits = c(min(df$Cost[df$Date > left]),
                                max(df$Cost[df$Date > left]))/div)

Use scale_color_manual and scale_linetype_manual to set custom colors and linetypes, if desired.
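As an alternative to dividing the data, the rescaling can be pushed into the labeller. The sketch below swaps in `label_number()` from scales, with its `scale` and `suffix` arguments, in place of the `dollar_format()` call used above; the data frame `costs` and the name `thousands` are made up for illustration.

```r
library(ggplot2)
library(scales)

# Made-up cost series standing in for the df used above
costs <- data.frame(day = 1:5, cost = c(12000, 18500, 25000, 31000, 47500))

# scale = 1e-3 rescales the labels only; the plotted data stay in original units
thousands <- label_number(scale = 1e-3, suffix = "K", accuracy = 1)

p_k <- ggplot(costs, aes(x = day, y = cost)) +
  geom_line() +
  scale_y_continuous(labels = thousands)

thousands(c(10000, 25000, 40000))
```

With this approach the limits can stay in the original units, since only the text of the labels changes.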


Why does ggplot suddenly change format of y-axis numbers to exponential?

This can be further simplified to a comparison of the two sequences directly:

options(scipen = 20) # Show more digits before resorting
# to scientific notation

seq(-0.2,0.35,0.05)
[1] -0.20 -0.15 -0.10 -0.05 0.00 0.05 0.10 0.15 0.20 0.25 0.30 0.35


seq(-0.3,0.35,0.05)
[1] -0.29999999999999998889777 -0.25000000000000000000000 -0.19999999999999998334665
[4] -0.14999999999999996669331 -0.09999999999999997779554 -0.04999999999999998889777
[7] 0.00000000000000005551115 0.05000000000000004440892 0.10000000000000003330669
[10] 0.15000000000000002220446 0.20000000000000001110223 0.25000000000000005551115
[13] 0.30000000000000009992007 0.34999999999999997779554

Looks like a floating-point issue, related to the question "Why are these numbers not equal?".

A quick fix could be to use

round(seq(-0.3,0.35,0.05), digits = 2)
[1] -0.30 -0.25 -0.20 -0.15 -0.10 -0.05 0.00 0.05 0.10 0.15 0.20 0.25 0.30 0.35

or

scale_y_continuous("Relatedness",
                   breaks = seq(-0.3, 0.35, 0.05),
                   labels = ~ format(round(.x, 2), nsmall = 2))
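The residue can be checked directly, going by the output printed above. In this sketch the names `brks` and `clean_label` are invented, and `trim = TRUE` is an addition to the answer's labelling function that removes the padding `format()` would otherwise apply.

```r
brks <- seq(-0.3, 0.35, 0.05)

brks[7] == 0                    # FALSE: the "zero" is a tiny non-zero number
isTRUE(all.equal(brks[7], 0))   # TRUE: equal within numerical tolerance

# Rounding first removes the residue, so a fixed two-decimal format works
clean_label <- function(x) format(round(x, 2), nsmall = 2, trim = TRUE)
clean_label(brks)
```

A value that close to zero, sitting next to ordinary two-decimal numbers, is what pushes the default formatting towards exponent notation, so rounding before formatting avoids it.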

