Plotting a "Sequence Logo" Using ggplot2

Plotting a sequence logo using ggplot2?

ggseqlogo should be what you're looking for. I hope it relieves some of the frustration that I'm sure many of you have felt when plotting sequence logos in R.
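
For anyone who has not used the package yet, here is a minimal sketch of the basic call. It assumes ggseqlogo is installed, and the four DNA sequences are made up purely for illustration.

```r
library(ggplot2)
library(ggseqlogo)

# Hypothetical aligned DNA sequences, all of the same length
seqs <- c("ACGTAC", "ACGTTC", "ACGAAC", "TCGTAC")

# ggseqlogo() returns a ggplot object, so the usual ggplot2 layers
# and themes can be added to it afterwards
p <- ggseqlogo(seqs, method = "bits")
print(p)
```

Because the result is an ordinary ggplot object, it can be combined with other plots like any other ggplot.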

How can I control the font size of sequence logos?

Actually, how about something like this?

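library(ggplot2)
library(ggseqlogo)
library(gridExtra)

# Shared namespace so both logos use the same amino-acid alphabet
AA_alphabet <- c('R','H','K','D','E','S','Y','T','N','Q','C','G','P','W','A','V','I','L','M','F')

AA1 <- c('RHKDES', 'RHKDES', 'RHKDGP', 'RHKDGP', 'TNQCGP')
AA2 <- c('RH', 'RH', 'RH', 'TN', 'TN')

# method = 'p' is shorthand for 'probability'
p1 <- ggplot() + geom_logo(AA1, method = 'p', seq_type = 'other', namespace = AA_alphabet) + theme_logo()
p2 <- ggplot() + geom_logo(AA2, method = 'p', seq_type = 'other', namespace = AA_alphabet) + theme_logo()

# grid.arrange() draws the combined plot itself, so no print() is needed
grid.arrange(p1, p2, ncol = 1)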

You can pass as many plots as you want to grid.arrange and adjust the layout of rows and columns with the ncol and nrow parameters.
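
As a rough sketch of how ncol shapes the layout, the example below arranges three plots in two columns. The toy scatterplots and their data are stand-ins of my own, not sequence logos; arrangeGrob is the gridExtra function that builds the same layout as grid.arrange without drawing it.

```r
library(ggplot2)
library(gridExtra)

# Three toy scatterplots standing in for sequence logos (hypothetical data)
toy <- data.frame(x = 1:5, y = c(2, 4, 3, 5, 1))
plots <- lapply(c("red", "blue", "darkgreen"), function(col) {
  ggplot(toy, aes(x, y)) + geom_point(colour = col)
})

# arrangeGrob() builds the layout as a gtable without drawing it;
# three plots in two columns need two rows
layout <- arrangeGrob(grobs = plots, ncol = 2)

# Draw the arranged plots on a fresh page
grid::grid.newpage()
grid::grid.draw(layout)
```

Keeping the layout in an object is handy when you want to save it, for example with ggsave, rather than only draw it on screen.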

R: Nice way to show ggplots on x and y-axis of another ggplot

The ggExtra package can draw plots on both axes of a scatterplot, as stated in its manual:

ggExtra is a collection of functions and layers to enhance ggplot2.
The flagship function is ggMarginal, which can be used to add marginal
histograms/boxplots/density plots to ggplot2 scatterplots.

Unfortunately, I could not find a function that lets me supply the marginal plots myself, so I inspected the source code and came up with this simple solution:

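library(ggplot2)
library(ggExtra)
# heat.map and logo are the existing ggplot objects from the question.
# The ::: calls reach unexported ggExtra internals, which may change between versions.
grob <- ggplot2::ggplotGrob(heat.map)
grob <- ggExtra:::addTopMargPlot(grob, top = logo, size = 10)
grob <- ggExtra:::addRightMargPlot(grob, right = logo + coord_flip(), size = 10)
plot(grob)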

Hopefully it will help others!

Sequence index plots in ggplot2 using geom_tile( )

Two small changes:

mvad_long$id <- as.factor(mvad_long$id)
ggplot(data = mvad_long, aes(x = Month, y = id, fill = state)) +
  geom_tile() +
  facet_wrap(~cluster, scales = "free_y")

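ggplot was treating id as a numeric variable rather than a factor, and the y scales were fixed (shared) across facets. Converting id to a factor gives each sequence its own row, and scales = "free_y" lets each cluster panel show only its own ids.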
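
If you don't have mvad_long at hand, here is a small self-contained sketch of the same two changes. The data frame toy_long and its values are invented for illustration; only the column names mirror the ones used above.

```r
library(ggplot2)

# Hypothetical long-format data: 4 ids in 2 clusters, 3 months each
toy_long <- data.frame(
  id      = rep(c(1, 2, 10, 11), each = 3),
  Month   = rep(1:3, times = 4),
  state   = c("school", "school", "job", "job", "job", "job",
              "school", "job", "job", "school", "school", "school"),
  cluster = rep(c("A", "B"), each = 6)
)

# Change 1: make id discrete so each sequence gets its own row
toy_long$id <- as.factor(toy_long$id)

# Change 2: free the y scale so each panel lists only its own ids
p <- ggplot(toy_long, aes(x = Month, y = id, fill = state)) +
  geom_tile() +
  facet_wrap(~ cluster, scales = "free_y")
print(p)
```

Without the factor conversion, ids 1, 2, 10 and 11 would sit on a continuous axis with a large empty gap between 2 and 10.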

Stack bars in different order for each stack

For this type of plot I would construct a data set to be plotted with geom_rect. For example, the data set below is constructed so that the ymin and ymax of each rectangle are defined by the rank order of A, C, G and T at each position. The resulting graphic may not look much like the one in your question, but the same method should produce a similar graphic from your data. You may need to tweak geom_text values and colors, but I think the primary question is how to construct the data set and plot it.

library(ggplot2)
library(dplyr)
library(tidyr)

# Make some random data
set.seed(42)
rna_seq_data <-
  tibble(position = 1:25,
         A = floor(runif(25, 0, 5000)),
         C = floor(runif(25, 0, 5000)),
         G = floor(runif(25, 0, 5000)),
         T = floor(runif(25, 0, 5000)))

tidyr::gather(rna_seq_data, key, value, -position) %>%
  dplyr::group_by(position) %>%
  dplyr::mutate(order = rank(value)) %>%
  dplyr::arrange(position, order) %>%
  # Stack smallest to largest: each rectangle starts where the previous one ends
  dplyr::mutate(ymax = cumsum(value),
                ymin = ymax - value,
                xmin = position - 0.45,
                xmax = position + 0.45) %>%
  dplyr::ungroup() %>%
  ggplot(.) +
  aes(xmin = xmin, xmax = xmax, ymin = ymin, ymax = ymax, fill = key) +
  geom_rect()
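
To check the stacking arithmetic by hand, here is a minimal sketch for a single position. The four counts are made up; the point is only that, once the values are sorted, each rectangle should start where the previous one ends.

```r
library(dplyr)

# One position with four hypothetical counts
one_pos <- tibble(position = 1,
                  key = c("A", "C", "G", "T"),
                  value = c(30, 10, 40, 20))

stacked <- one_pos %>%
  group_by(position) %>%
  arrange(value, .by_group = TRUE) %>%   # smallest value at the bottom
  mutate(ymax = cumsum(value),           # top edge: running total
         ymin = ymax - value) %>%        # bottom edge: previous top edge
  ungroup()

print(stacked)
```

Here the stack order is C, T, A, G with edges 0-10, 10-30, 30-60 and 60-100, so the rectangles tile the column without gaps or overlaps.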

Plotting Bar Chart in custom order and color sequence using ggplot library in R

I figured out the issues and have posted my answer inspired by @cymon's and @Ronak Shah's solutions.

library(ggplot2)

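# stringsAsFactors = FALSE keeps 'group' as character (the default before R 4.0 was TRUE)
z <- data.frame(group = c("trtA", "trtAB", "trtB", "veh"),
                Mean = c(42.990, 50.955, 34.235, 31.992),
                sd = c(15.541, 18.325, 9.737, 12.463),
                stringsAsFactors = FALSE)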

colorVals = c("black", "red", "blue", "purple")

# Before assigning new values to existing 'group' values

ggplot(z, aes(x = group, y = Mean, fill = group)) +
  geom_bar(stat = "identity", position = position_dodge()) +
  geom_errorbar(aes(ymin = Mean - sd, ymax = Mean + sd), width = .4) +
  geom_text(aes(label = round(Mean, 2), fontface = "bold"), vjust = -0.5) +
  scale_fill_manual(values = colorVals) +
  labs(x = "Treatment Group", y = "Signal Value") +
  theme_classic(base_size = 14) +
  theme(legend.position = 'none') +
  theme(axis.text.x = element_text(angle = 20, hjust = 1))

# After assigning new values to existing 'group' values

GroupA <- "Vehicle"
GroupB <- "Treatment-A"
GroupC <- "Treatment-B"
GroupD <- "Treatment-AB"

names(colorVals) <- c(GroupA, GroupB, GroupC, GroupD)

z$group[z$group == "veh"] <- GroupA
z$group[z$group == "trtA"] <- GroupB
z$group[z$group == "trtB"] <- GroupC
z$group[z$group == "trtAB"] <- GroupD

z$group <- factor(z$group, ordered=TRUE, levels=names(colorVals))

ggplot(z, aes(x = group, y = Mean, fill = group)) +
  geom_bar(stat = "identity", position = position_dodge()) +
  geom_errorbar(aes(ymin = Mean - sd, ymax = Mean + sd), width = .4) +
  geom_text(aes(label = round(Mean, 2), fontface = "bold"), vjust = -0.5) +
  scale_fill_manual(values = colorVals) +
  labs(x = "Treatment Group", y = "Signal Value") +
  theme_classic(base_size = 14) +
  theme(legend.position = 'none') +
  theme(axis.text.x = element_text(angle = 20, hjust = 1))
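
As an aside, the relabelling and the ordering can probably be done in one step with base R's factor(), using its levels and labels arguments. This is only a sketch of that alternative, not what I used above; the vector names groups and group_f are made up for the example.

```r
# Original group codes, in the order they appear in the data
groups <- c("trtA", "trtAB", "trtB", "veh")

# levels sets the display order, labels renames each level in that order
group_f <- factor(groups,
                  levels = c("veh", "trtA", "trtB", "trtAB"),
                  labels = c("Vehicle", "Treatment-A", "Treatment-B", "Treatment-AB"))

levels(group_f)        # order used on the x-axis
as.character(group_f)  # relabelled values, still in data order
```

ggplot2 draws discrete axes in level order, so the bars would appear as Vehicle, Treatment-A, Treatment-B, Treatment-AB.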

TraMineR sequence plot with ggplot2

The online help page of seqplot (of which seqdplot is an alias for type="d") states:

A State distribution plot (type="d") represents the sequence of the
cross-sectional state frequencies by position (time point) computed by
the seqstatd function and rendered with the plot.stslist.statd method.
Such plots are also known as chronograms.

So you get the data used by seqdplot with the seqstatd function. Specifically, the distributions are in the Frequencies element of its result.

Your sample data contains only three sequences of length 10 with a single spell in state 'OT'. I stored it in s.spl.

s.spl
# Sequence
# 1 OT-OT-OT-OT-OT-OT-OT-OT-OT-OT
# 2 OT-OT-OT-OT-OT-OT-OT-OT-OT-OT
# 3 OT-OT-OT-OT-OT-OT-OT-OT-OT-OT

The distributions by position are

sd <- seqstatd(s.spl)
sd$Frequencies
#    04:00 04:10 04:20 04:30 04:40 04:50 05:00 05:10 05:20 05:30
# PC     0     0     0     0     0     0     0     0     0     0
# SL     0     0     0     0     0     0     0     0     0     0
# EA     0     0     0     0     0     0     0     0     0     0
# WR     0     0     0     0     0     0     0     0     0     0
# ST     0     0     0     0     0     0     0     0     0     0
# DI     0     0     0     0     0     0     0     0     0     0
# FP     0     0     0     0     0     0     0     0     0     0
# FO     0     0     0     0     0     0     0     0     0     0
# LA     0     0     0     0     0     0     0     0     0     0
# IR     0     0     0     0     0     0     0     0     0     0
# HO     0     0     0     0     0     0     0     0     0     0
# CH     0     0     0     0     0     0     0     0     0     0
# CA     0     0     0     0     0     0     0     0     0     0
# LE     0     0     0     0     0     0     0     0     0     0
# CO     0     0     0     0     0     0     0     0     0     0
# TV     0     0     0     0     0     0     0     0     0     0
# RA     0     0     0     0     0     0     0     0     0     0
# TR     0     0     0     0     0     0     0     0     0     0
# OT     1     1     1     1     1     1     1     1     1     1

Good luck if you want to rewrite TraMineR's plotting facilities with ggplot2.
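
As a possible starting point, here is a rough sketch of how a state-by-time frequency matrix could be reshaped and drawn as a chronogram with ggplot2. The matrix freq is a made-up stand-in for sd$Frequencies, and I am assuming that object behaves like a plain numeric matrix with states as rows and time points as columns.

```r
library(ggplot2)

# Hypothetical 3-state x 4-time-point matrix standing in for sd$Frequencies;
# matrix() fills column by column, so each column is one time point
freq <- matrix(
  c(0.5, 0.3, 0.2,
    0.4, 0.4, 0.2,
    0.2, 0.5, 0.3,
    0.1, 0.5, 0.4),
  nrow = 3,
  dimnames = list(state = c("PC", "SL", "OT"),
                  time  = c("04:00", "04:10", "04:20", "04:30"))
)

# as.table() + as.data.frame() gives one row per state/time combination
freq_long <- as.data.frame(as.table(freq), responseName = "share")

# Stacked bars of width 1 touch each other, which mimics a state distribution plot
p <- ggplot(freq_long, aes(x = time, y = share, fill = state)) +
  geom_col(width = 1)
print(p)
```

With real data you would replace freq with the matrix returned by seqstatd and pick colors to match the state palette.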


