Storing Ggplot Objects in a List from Within Loop in R

In addition to the other excellent answer, here’s a solution that uses “normal”-looking evaluation rather than eval. Since for loops have no separate variable scope (i.e. they run in the current environment), we need to wrap the loop body in local; in addition, we need to make i a local variable, which we can do by re-assigning it to its own name [1]:

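library(ggplot2)

myplots <- vector('list', ncol(data2))

for (i in seq_along(data2)) {
  message(i)
  myplots[[i]] <- local({
    i <- i  # local copy, so the plot keeps this iteration's value
    p1 <- ggplot(data2, aes(x = data2[[i]])) +
      geom_histogram(fill = "lightgreen") +
      xlab(colnames(data2)[i])
    print(p1)  # print() returns p1 invisibly, so p1 is also the value stored
  })
}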

However, an altogether cleaner way is to forgo the for loop entirely and use list functions to build the result. This works in several possible ways. The following is the easiest in my opinion:

plot_data_column = function(data, column) {
  # aes_string() is deprecated in current ggplot2; the .data pronoun replaces it
  ggplot(data, aes(x = .data[[column]])) +
    geom_histogram(fill = "lightgreen") +
    xlab(column)
}

myplots <- lapply(colnames(data2), plot_data_column, data = data2)

This has several advantages: it’s simpler, and it won’t clutter the environment (with the loop variable i).


[1] This might seem confusing: why does i <- i have any effect at all? Because by performing the assignment we create a new, local variable with the same name as the variable in the outer scope. We could equally have used a different name, e.g. local_i <- i.
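The effect of local and i <- i can be seen without ggplot at all. The following is a minimal base R sketch, not part of the original answer: the lists fs and gs and their closures are hypothetical stand-ins for the stored plots, which also look up i only when they are eventually used.

```r
# Closures created in a plain for loop all share the one loop variable `i`.
fs <- list()
for (i in 1:3) {
  fs[[i]] <- function() i
}
shared <- vapply(fs, function(f) f(), integer(1))

# Wrapping the body in local() and re-assigning `i` gives each closure
# its own copy of the value from that iteration.
gs <- list()
for (i in 1:3) {
  gs[[i]] <- local({
    i <- i
    function() i
  })
}
isolated <- vapply(gs, function(f) f(), integer(1))

print(shared)    # every closure sees the final value of i
print(isolated)  # each closure sees its own value
```

In the first loop every function reports the last value of i; in the second, each one reports the value it was created with.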

Trying to make a list of ggplot objects in a for loop; all items in list are written as last iteration from loop

You can create a list of plots directly using sapply. For example:

plist = sapply(setdiff(names(mtcars), "mpg"), function(col) {
  # aes_string() is deprecated; .data[[col]] looks the column up by name
  ggplot(mtcars, aes(x = mpg, y = .data[[col]])) + geom_smooth() + geom_point() + ylab(col)
}, simplify=FALSE)

The list elements (each of which is a ggplot object) will be named after the y variable in the plot:

names(plist)
[1] "cyl"  "disp" "hp"   "drat" "wt"   "qsec" "vs"   "am"   "gear" "carb"

You can print all the plots by typing plist in the console. Or, for a single plot, just select the plot you want:

plist[["hp"]]
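Because the list is named, the names can also drive file output. The following is a rough sketch under assumptions of my own: the two-column list plots, the "mpg_vs_" file prefix, and the use of tempdir() are illustrative choices, not part of the original answer.

```r
library(ggplot2)

# A small named list of plots, built in the same style as plist
cols <- c("hp", "wt")
plots <- sapply(cols, function(col) {
  ggplot(mtcars, aes(x = mpg, y = .data[[col]])) + geom_point() + ylab(col)
}, simplify = FALSE)

# One PDF per plot, with the list names used in the file names
files <- file.path(tempdir(), paste0("mpg_vs_", names(plots), ".pdf"))
for (k in seq_along(plots)) {
  ggsave(files[k], plot = plots[[k]], width = 6, height = 4)
}
```

Passing the plot explicitly to ggsave avoids relying on whichever plot was displayed last.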

For a situation like this, you might prefer faceting, which requires converting the data from wide to long format. You can have facets with different y scales by setting scales="free_y".

library(tidyverse)

ggplot(gather(mtcars, key, value, -mpg), aes(mpg, value)) +
geom_smooth() + geom_point() +
facet_wrap(~ key, scales="free_y", ncol=5)
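To see what the wide-to-long conversion produces, it can help to run the reshaping step on its own. gather() still works, but tidyr now documents it as superseded by pivot_longer(); the sketch below uses pivot_longer() and keeps the column names key and value so that it lines up with the faceting call. The variable name long is only illustrative.

```r
library(tidyr)

# One row per (car, variable) pair; mpg is kept as its own column
long <- pivot_longer(mtcars, cols = -mpg, names_to = "key", values_to = "value")

dim(long)
head(long)
```

The resulting data frame can be passed to ggplot() in place of the gather() call.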

How to store plots in a list when inside a loop?

This answer is based on: Storing plot objects in a list

library(ggplot2)
library(gridExtra)

plist <- list()

for (z in 1:5) {
  n <- 100
  k <- seq(0, to = 4500 + z * 2000, length.out = n)
  tmp <- 5 * (3 * seq_len(n))^2  # vectorized form of the original inner loop

  data <- data.frame(n, k, tmp)

  # data needs to be given to ggplot() so that each plot keeps its own data
  plist[[z]] <- ggplot(data = data) +
    geom_line(aes(x = k, y = tmp)) +
    theme_bw()

  # Write each plot to its own PDF file
  pdf(sprintf("p%s.pdf", z), width = 6, height = 4, onefile = TRUE)
  print(plist[[z]])
  dev.off()
}

do.call(grid.arrange, c(plist, ncol = 5))

Plots are not stored in list during loop

You can use the following code:

library(ggplot2)

myplots <- vector('list', ncol(data2))

for (i in seq_along(data2)) {
myplots[[i]] <- ggplot(data2, aes(x = .data[[colnames(data2)[i]]])) +
geom_histogram(fill = "lightgreen")
}

However, using lapply would be easier.

myplots <- lapply(names(data2), function(x)  
ggplot(data2, aes(x = .data[[x]])) + geom_histogram(fill = "lightgreen"))

Plot the list of plots with grid.arrange.

gridExtra::grid.arrange(grobs = myplots)

Data:

A <- c(2, 4, 1, 2, 5, 1, 2, 0, 1, 4, 4, 3, 5, 2, 4, 3, 3, 6, 5, 3, 6, 4, 3, 4, 4, 3, 4, 
2, 4, 3, 3, 5, 3, 5, 5, 0, 0, 3, 3, 6, 5, 4, 4, 1, 3, 3, 2, 0, 5, 3, 6, 6, 2, 3)
B <- c(2, 4, 4, 0, 4, 4, 4, 4, 1, 4, 4, 3, 5, 0, 4, 5, 3, 6, 5, 3, 6, 4, 4, 2, 4, 4, 4,
1, 1, 2, 2, 3, 3, 5, 0, 3, 4, 2, 4, 5, 5, 4, 4, 2, 3, 5, 2, 6, 5, 2, 4, 6, 3, 3)
C <- c(2, 5, 4, 1, 4, 2, 3, 0, 1, 3, 4, 2, 5, 1, 4, 3, 4, 6, 3, 4, 6, 4, 1, 3, 5, 4, 3,
2, 1, 3, 2, 2, 2, 4, 0, 1, 4, 4, 3, 5, 3, 2, 5, 2, 3, 3, 4, 2, 4, 2, 4, 5, 1, 3)
data2 <- data.frame(A,B,C)

R assigning ggplot objects to list in loop

The answers so far are very close, but unsatisfactory in my opinion. The problem is the following. After your for loop:

myplots[[1]]$mapping
#* x -> 1:dfmsize
#* y -> dfrm[, i]
myplots[[1]]$plot_env
#<environment: R_GlobalEnv>

myplots[[2]]$mapping
#* x -> 1:dfmsize
#* y -> dfrm[, i]
myplots[[2]]$plot_env
#<environment: R_GlobalEnv>

i
#[1] "B"

As the other answers mention, ggplot doesn't actually evaluate those expressions until plotting, and since these are all in the global environment, and the value of i is "B", you get the undesirable results.
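The same delayed evaluation can be reproduced in base R. This is only an illustration: the data frame dfrm below is hypothetical, and quote()/eval() stand in for what ggplot does with the stored mapping.

```r
# Hypothetical data; the expression is stored now but evaluated later
dfrm <- data.frame(A = 1:3, B = 3:1)
y_expr <- quote(dfrm[, i])

i <- "A"
first <- eval(y_expr)   # evaluated while i is "A"

i <- "B"
second <- eval(y_expr)  # the same expression, evaluated while i is "B"

print(first)
print(second)
```

The stored expression never changes; only the value of i at evaluation time does, which is why every plot ends up showing the last column.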

There are several ways of avoiding this issue, the simplest of which in fact simplifies your expressions:

myplots = lapply(v, function(col)
ggplot(dfrm, aes(x=1:dfmsize, y=dfrm[,col])) + geom_point() + labs(y=col))

This works because each call that lapply makes to the function has its own environment:

myplots[[1]]$mapping
#* x -> 1:dfmsize
#* y -> dfrm[, col]
myplots[[1]]$plot_env
#<environment: 0x000000000bc27b58>

myplots[[2]]$mapping
#* x -> 1:dfmsize
#* y -> dfrm[, col]
myplots[[2]]$plot_env
#<environment: 0x000000000af2ef40>

eval(quote(dfrm[, col]), envir = myplots[[1]]$plot_env)
#[1] 1 2 3 4 5 6 7 8 9 10
eval(quote(dfrm[, col]), envir = myplots[[2]]$plot_env)
#[1] 10 9 8 7 6 5 4 3 2 1

So even though the expressions are the same, the results are different.

And in case you're wondering what exactly is stored in the environment of each lapply call: unsurprisingly, it's just the column name:

ls(myplots[[1]]$plot_env)
#[1] "col"
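The per-call environments can also be inspected without ggplot. In this small base R sketch the helper capture_env is hypothetical; it simply returns the environment of its own call, which is what a plot created inside the function would record as its plot environment.

```r
# Return the environment created for this particular call
capture_env <- function(col) {
  force(col)  # make sure the argument is evaluated before the call returns
  environment()
}

envs <- lapply(c("A", "B"), capture_env)

same_env <- identical(envs[[1]], envs[[2]])
cols_seen <- vapply(envs, function(e) get("col", envir = e), character(1))
vars <- ls(envs[[1]])

print(same_env)   # the two calls do not share an environment
print(cols_seen)  # each environment holds its own value of col
print(vars)       # and nothing else
```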

Storing ggplots in loop makes all plots the last one

Here is the solution you are looking for:

library(tidyverse)

createPlot = function(i){
  dfs = tibble(x = 0, xend = 3*sin((1:12)[i]*pi/6), y=0, yend = 3*cos((1:12)[i]*pi/6))
  dfl = tibble(x = 0, xend = 4.5*sin((1:60)[i]*pi/30), y=0, yend =4.5*cos((1:60)[i]*pi/30))
  # Each call builds its plot from data frames local to that call and returns it
  ggplot()+
    geom_segment(aes(x, y, xend = xend, yend = yend), data=dfs, colour = "black", size = 3)+
    geom_segment(aes(x, y, xend = xend, yend = yend), data=dfl, colour = "black", size = 3)
}

multiplePlots = tibble(index = 1:5) %>%
  rowwise() %>%
  mutate(clockPlots = list(createPlot(index)))

lapply(multiplePlots$clockPlots, plot)

ggplots stored in plot list to respect variable values at time of plot generation within for loop

I propose the following solution, although it does not explain why your original approach fails:

temporary_function <- function(COL_i){
  COL_i_index <- which(COL_i == COLs_current)

  # Generate "basis boxplot" (to plot scatterplot on top)
  boxplot_scores <- data_temp %>%
    gather(COL, score, all_of(COLs_current)) %>%
    ggplot(aes(x = COL, y = score)) +
    geom_boxplot()

  # Get relevant data of COL_i for scattering: data of 4th quartile
  quartile_values <- quantile(data_temp[[COL_i]])
  threshold <- quartile_values["75%"] # threshold = third quartile value
  data_temp_filtered <- data_temp %>%
    filter(.data[[COL_i]] > threshold) %>% # keep the rows in the 4th quartile
    dplyr::select(all_of(COLs_current))

  # Create layer of scatter for 4th quartile of COL_i
  scatter <- geom_point(data = data_temp_filtered,
                        mapping = aes(x = COL_i_index,
                                      y = .data[[COL_i]]),
                        color = "orange")

  # add geom objects to create final plot for COL_i
  current_plot_complete <- boxplot_scores + scatter

  return(current_plot_complete)
}

# Call lapply only after the function has been defined
l <- lapply(choice_COLs, temporary_function)

With lapply the problem does not arise, because every call to the function gets its own environment.
This answer is inspired by this post.


