List for Multiple Plots from Loop (Ggplot2) - List Elements Being Overwritten

Variable geom_text is overwritten when plots saved in list

Using lapply instead of a for loop works fine:

my.list <- lapply(1:2, function(i) {
  ggplot(data = dt, aes(x = x, y = y)) +
    geom_point(size = 1.5) +
    labs(x = NULL, y = NULL) +
    theme_bw() +
    theme(panel.background = element_rect(fill = 'light grey', colour = 'black'),
          legend.position = "none") +
    geom_text(inherit.aes = FALSE,
              aes(x = 50000, y = 100000, label = paste0('NRMSE:', i))) +
    ggtitle(paste0(plotname[i]))
})

ggarrange(plotlist = my.list)

Note: the issue is not with ggarrange.


Roland:

The plot is built when you print the ggplot object. Anything that is not part of the data passed in is looked up in the enclosing environment at exactly that point in time. If you use the iterator of a for loop in the plot, it has its last value by then (or whatever value you have assigned to it since). lapply avoids the issue because each call to the anonymous function has its own environment with its own i; see also the Note in the lapply documentation.
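A minimal sketch of the behaviour described above, assuming ggplot2 3.0.0 or later. The data frame df and the helper built_label are made up for illustration and are not part of the original question. ggplot_build() is used to read the label each stored plot would draw, without printing anything.

```r
library(ggplot2)

# Hypothetical toy data; not from the original question.
df <- data.frame(x = 1:3, y = 1:3)

# Illustrative helper: the label that the text layer (layer 2) would draw.
built_label <- function(p) {
  as.character(ggplot_build(p)$data[[2]]$label[1])
}

# for loop: every stored plot refers to the same variable i.
loop_plots <- list()
for (i in 1:2) {
  loop_plots[[i]] <- ggplot(df, aes(x, y)) +
    geom_point() +
    geom_text(aes(x = 2, y = 2, label = paste0("NRMSE:", i)),
              inherit.aes = FALSE)
}

# lapply: each function call has its own environment and its own i.
lapply_plots <- lapply(1:2, function(i) {
  ggplot(df, aes(x, y)) +
    geom_point() +
    geom_text(aes(x = 2, y = 2, label = paste0("NRMSE:", i)),
              inherit.aes = FALSE)
})

# Labels are resolved only now, when the plots are built.
loop_labels   <- vapply(loop_plots, built_label, character(1))
lapply_labels <- vapply(lapply_plots, built_label, character(1))
```

Here loop_labels holds the same label twice (the one for the last value of i), while lapply_labels holds one distinct label per plot.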


Related post:

The problem is that ggplot() waits until you print the plot to resolve the variables in the aes() command.

Storing ggplot objects in a list from within loop in R

In addition to the other excellent answer, here’s a solution that uses “normal”-looking evaluation rather than eval. Since for loops have no separate variable scope (i.e. they are performed in the current environment), we need to use local to wrap the body of the for block; in addition, we need to make i a local variable, which we can do by re-assigning it to its own name [1]:

myplots <- vector('list', ncol(data2))

for (i in seq_along(data2)) {
  message(i)
  myplots[[i]] <- local({
    i <- i  # create a local copy of the loop variable
    p1 <- ggplot(data2, aes(x = data2[[i]])) +
      geom_histogram(fill = "lightgreen") +
      xlab(colnames(data2)[i])
    print(p1)  # draws the plot and returns it invisibly, so it is also stored
  })
}

However, an altogether cleaner way is to forgo the for loop entirely and use list functions to build the result. This works in several possible ways. The following is the easiest in my opinion:

plot_data_column = function (data, column) {
  # .data[[column]] replaces aes_string(), which is deprecated since ggplot2 3.0.0
  ggplot(data, aes(x = .data[[column]])) +
    geom_histogram(fill = "lightgreen") +
    xlab(column)
}

myplots <- lapply(colnames(data2), plot_data_column, data = data2)

This has several advantages: it’s simpler, and it won’t clutter the environment (with the loop variable i).


[1] This might seem confusing: why does i <- i have any effect at all? Because by performing the assignment we create a new, local variable with the same name as the variable in the outer scope. We could equally have used a different name, e.g. local_i <- i.
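The effect of i <- i inside local() can be seen without ggplot2 at all. The following base R sketch (the names shared and separate are made up for illustration) stores plain functions instead of plots; like a stored plot, each function only looks up i when it is eventually called.

```r
# Closures created directly in a for loop all see the same variable i.
shared <- list()
for (i in 1:3) {
  shared[[i]] <- function() i
}

# Wrapping the body in local() and copying i gives each closure its own i.
separate <- list()
for (i in 1:3) {
  separate[[i]] <- local({
    i <- i
    function() i
  })
}

# Call the stored functions after both loops have finished.
shared_values   <- vapply(shared, function(f) f(), integer(1))
separate_values <- vapply(separate, function(f) f(), integer(1))
```

After the loops, shared_values contains the last value of i three times, whereas separate_values contains 1, 2, 3.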

ggplots stored in plot list to respect variable values at time of plot generation within for loop

I propose this solution, although it doesn't explain why your approach doesn't work:

library(dplyr)
library(tidyr)
library(ggplot2)

temporary_function <- function(COL_i){
  COL_i_index <- which(COL_i == COLs_current)

  # Generate "basis boxplot" (to plot scatterplot on top)
  boxplot_scores <- data_temp %>%
    gather(COL, score, all_of(COLs_current)) %>%
    ggplot(aes(x = COL, y = score)) +
    geom_boxplot()

  # Get relevant data of COL_i for scattering: data of 4th quartile
  quartile_values <- quantile(data_temp[[COL_i]])
  threshold <- quartile_values["75%"] # threshold = third quartile value
  data_temp_filtered <- data_temp %>%
    filter(.data[[COL_i]] > threshold) %>% # keep the rows in the 4th quartile
    dplyr::select(all_of(COLs_current))

  # Create layer of scatter for 4th quartile of COL_i
  scatter <- geom_point(data = data_temp_filtered,
                        mapping = aes(x = COL_i_index,
                                      y = .data[[COL_i]]),
                        color = "orange")

  # Add geom objects to create final plot for COL_i
  current_plot_complete <- boxplot_scores + scatter

  return(current_plot_complete)
}

# Call lapply only after temporary_function has been defined
l <- lapply(choice_COLs, temporary_function)

When you use lapply you don't have such a problem.
It is inspired by this post

qplot call overwrites list elements

In ggplot the aesthetics are stored as expressions and evaluated when the plot is rendered. So qplot(i) does not generate a plot, but rather a plot definition, using a reference to the variable i. All three plots are the same in the sense that they all reference i.

If you type

list2[[1]]

after the second loop has run, you cause the ggplot object stored in list2[[1]] to be rendered, using whatever value i is set to at the moment (which is 3 after the loop).

Try this:

i <- 4
list2[[1]]

Now the plot rendered is equivalent to qplot(4).

The workaround depends on what you are trying to achieve. The basic idea is not to use external variables in aesthetics. So in your trivial case,

for (i in 1:3) {
  list2[[i]] <- ggplot(data.frame(x = i), aes(x)) + geom_histogram()
}

will work. This is because the reference to the external variable i is not in the aesthetics (i.e., the call to aes(...)).
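Another way to keep the loop variable out of the stored expression, not mentioned in the answer above, is to inline its current value with the !! operator, which aes() supports from ggplot2 3.0.0 onward. A minimal sketch with made-up data (df and built_labels are illustrative names):

```r
library(ggplot2)

# Hypothetical toy data for illustration.
df <- data.frame(x = 1:3, y = 1:3)

plots <- list()
for (i in 1:3) {
  # !! evaluates paste0() right now and stores the resulting string,
  # so the plot no longer refers to the variable i.
  plots[[i]] <- ggplot(df, aes(x, y)) +
    geom_point() +
    geom_text(aes(x = 2, y = 2, label = !!paste0("plot ", i)),
              inherit.aes = FALSE)
}

# Read back the label each plot would draw (layer 2 is the text layer).
built_labels <- vapply(
  plots,
  function(p) as.character(ggplot_build(p)$data[[2]]$label[1]),
  character(1)
)
```

Each stored plot keeps its own label even though all of them were created in the same for loop and are built only afterwards.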

Writing a loop to create ggplot figures with different data sources and titles

This might do the trick:
Initiate two loops, one for the complex iteration and a second for the dataset iteration. Then use paste0() or paste() to generate the correct filenames and headings.

PS: I didn't test the code, since I don't have your data. But it should give you an idea.

# loop over complexes
for (c in 1:10) {

  # create a PDF for every complex
  pdf(file = paste0("complex", c, "analysis.pdf"), paper = 'A4r')

  # loop over datasets
  for (d in 1:3) {

    # plot; inside a loop the plot must be print()ed explicitly to reach the PDF
    print(
      ggplot(get(paste0("df_tbl_data", d, "_comp", c)), aes(Size_Range, Abundance, group = factor(Gene_Name))) +
        theme(legend.title = element_blank()) +
        geom_line(aes(color = factor(Gene_Name))) +
        ggtitle(paste0("Data", d, " - complex ", c)) +
        theme(axis.text.x = element_text(angle = 90, hjust = 1))
    )
  }
  dev.off()

}

Grid of multiple ggplot2 plots which have been made in a for loop

I would be inclined to agree with Richie, but if you want to arrange them yourself:

library(gridExtra)
library(ggplot2)
p <- list()
for(i in 1:4){
p[[i]] <- qplot(1:10,10:1,main=i)
}
do.call(grid.arrange,p)

Take a look at the examples at the end of ?arrangeGrob for ways to eliminate the for loop altogether:

plots = lapply(1:5, function(.x) qplot(1:10,rnorm(10),main=paste("plot",.x)))
require(gridExtra)
do.call(grid.arrange, plots)
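As an alternative to do.call, a list of plots can be passed through the grobs argument of arrangeGrob() or grid.arrange(). The sketch below uses made-up data and illustrative names (df, grid_layout), and uses ggplot() rather than qplot(); it only builds the layout object and does not draw it.

```r
library(ggplot2)
library(gridExtra)

# Hypothetical toy data for illustration.
df <- data.frame(x = 1:10, y = 10:1)

# Build the plots with lapply so each title keeps its own value of k.
plots <- lapply(1:4, function(k) {
  ggplot(df, aes(x, y)) +
    geom_point() +
    ggtitle(paste("plot", k))
})

# arrangeGrob() builds the 2-column layout without drawing it.
grid_layout <- arrangeGrob(grobs = plots, ncol = 2)
```

To draw the grid on the current device, grid.arrange(grobs = plots, ncol = 2) can be called instead.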

