Purrr Map Equivalent of Nested for Loop

purrr map equivalent of nested for loop

As @r2evans points out, the .x from your first call is masked by the inner lambda. However, you can write the inner lambda to take two parameters, .x and .y, and pass the outer .x in as the inner .y through the ... argument.

I'll use walk rather than map because in this case you're only interested in side effects (printing).

walk(1:4,~ walk(1:6, ~ print(paste(.x, .y, sep = "-")),.y=.x))

Another option is to use expand.grid to lay out the combinations, and then iterate on those with pwalk (or pmap in other circumstances)

purrr::pwalk(expand.grid(1:6, 1:4), ~ print(paste(.x, .y, sep = "-")))

Output in both cases:

[1] "1-1"
[1] "2-1"
[1] "3-1"
[1] "4-1"
[1] "5-1"
[1] "6-1"
[1] "1-2"
[1] "2-2"
[1] "3-2"
[1] "4-2"
[1] "5-2"
[1] "6-2"
[1] "1-3"
[1] "2-3"
[1] "3-3"
[1] "4-3"
[1] "5-3"
[1] "6-3"
[1] "1-4"
[1] "2-4"
[1] "3-4"
[1] "4-4"
[1] "5-4"
[1] "6-4"
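To check that both approaches visit the same combinations in the same order as a plain nested for loop, the strings can be collected instead of printed. This is a minimal sketch, assuming purrr is installed; the names nested, gridded and looped are illustrative only.

```r
library(purrr)

# Nested version: the outer value is handed to the inner lambda as .y via ...
nested <- unlist(map(1:4, ~ map_chr(1:6, ~ paste(.x, .y, sep = "-"), .y = .x)))

# Grid version: the first column of expand.grid varies fastest,
# so it plays the role of the inner loop
gridded <- pmap_chr(expand.grid(1:6, 1:4), ~ paste(.x, .y, sep = "-"))

# Plain nested for loop for reference
looped <- character(0)
for (j in 1:4) {
  for (i in 1:6) {
    looped <- c(looped, paste(i, j, sep = "-"))
  }
}

identical(nested, looped)
identical(unname(gridded), looped)
```

The order of the arguments to expand.grid matters here: swapping them changes which index varies fastest.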

Nested looping with map from purrr package in R

I would use explicit anonymous functions here to avoid the confusion and make it clear which argument belongs to which loop.

map(0:1, function(x)
  map_df(0:7, function(y)
    data.frame(summary(emtrends(fit, ~ses, var = "time", max.degree = 2,
                                at = list(ses = x, time = y))))[1, ]))

#[[1]]
# ses degree time.trend SE df lower.CL upper.CL
#1 0 linear 0.34156633 0.06084479 1594 0.22222212 0.4609105
#2 0 linear 0.29736820 0.04502015 1594 0.20906326 0.3856731
#3 0 linear 0.25317006 0.03015484 1594 0.19402275 0.3123174
#4 0 linear 0.20897192 0.01869728 1594 0.17229808 0.2456458
#5 0 linear 0.16477378 0.01864500 1594 0.12820248 0.2013451
#6 0 linear 0.12057564 0.03005757 1594 0.06161911 0.1795322
#7 0 linear 0.07637750 0.04491162 1594 -0.01171454 0.1644695
#8 0 linear 0.03217936 0.06073239 1594 -0.08694438 0.1513031

#[[2]]
# ses degree time.trend SE df lower.CL upper.CL
#1 1 linear 0.43528003 0.08297255 1594 0.27253325 0.5980268
#2 1 linear 0.38631190 0.06139288 1594 0.26589262 0.5067312
#3 1 linear 0.33734376 0.04112141 1594 0.25668603 0.4180015
#4 1 linear 0.28837563 0.02549702 1594 0.23836441 0.3383868
#5 1 linear 0.23940750 0.02542573 1594 0.18953612 0.2892789
#6 1 linear 0.19043936 0.04098878 1594 0.11004179 0.2708369
#7 1 linear 0.14147123 0.06124487 1594 0.02134227 0.2616002
#8 1 linear 0.09250309 0.08281927 1594 -0.06994303 0.2549492
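The same nesting pattern can be tried without a fitted model. The sketch below is not the emtrends call above: one_row is a hypothetical stand-in that returns a one-row data frame, and the inner results are bound with do.call(rbind, ...) instead of map_df so that only purrr is needed.

```r
library(purrr)

# Hypothetical stand-in for the model call: one row per (x, y) pair
one_row <- function(x, y) data.frame(ses = x, time = y, value = x * 10 + y)

# Outer loop over x returns a list; the inner loop over y is bound
# into a single data frame for each x
res <- map(0:1, function(x) {
  do.call(rbind, map(0:2, function(y) one_row(x, y)))
})

length(res)   # one data frame per value of x
res[[2]]      # rows for x = 1 and y = 0, 1, 2
```

Because x and y are named arguments of separate functions, neither one masks the other.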

How to convert this nested for loop into a purrr function

Pass frame_list to the outer purrr call; in the inner call, iterate over the columns of each data frame and keep the names of the columns that match. I added a blah column to data frame y, also changed

library(dplyr)
x <- data.frame(first = c(1:3, 6:7),
second = c(1:3, 6:7),
third = c(1:5))

y <- data.frame(differ = c(10:13),
other = c(10:13),
nomatch = rep(5),
blah=rep(5),
another = c(10:13),
argh = c(9:12))

z <- data.frame(nothing = c(20:27),
nope = c(20:27),
noway = c(16,17, 18, 19, 5, 21, 22, 23),
still = c(21:28),
again = c(22:29))

frame_list <- list(first = x,
second = y,
third = z)

check_against <- rep(5, 5)

col_name_list=purrr::map(frame_list, function(w) {
m=purrr::map_lgl(w, function(x) {
return(any(unlist(x) %in% check_against))
})
names(w)[m]
})

col_name_list

Output:

$first
[1] "third"

$second
[1] "nomatch" "blah"

$third
[1] "noway"
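The two levels are easier to see on a smaller input. A minimal sketch, assuming purrr is installed; frames, target and hits are hypothetical names, not part of the original question.

```r
library(purrr)

# Small hypothetical inputs in the same shape as frame_list above
frames <- list(
  a = data.frame(p = 1:3, q = c(4, 5, 6)),
  b = data.frame(r = c(5, 5, 5), s = 7:9, t = c(0, 5, 0))
)
target <- 5

# Outer map: one data frame at a time
# Inner map_lgl: one column at a time, TRUE if the column contains the target
hits <- map(frames, function(d) {
  names(d)[map_lgl(d, function(col) any(col %in% target))]
})

hits
```

The result is a named list with one character vector of matching column names per data frame.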

nested map functions with purrr

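We can nest a second map inside the first one to run knn for different values of K. Note that inside the inner lambda, .x refers to the current K, not to the bootstrap sample, so anything needed from the outer element has to be assigned to a name first.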

library(tidyverse)
rerun(5, gen_boot_sample(train_data)) %>%
map(~ {
# create the subset datasets
train_data <- .x %>%
select_at(vars(contains('x')))
train_label <- .x %>%
select_at(.vars = vars(contains("y"))) %>%
pull()
test_data <- test_data %>%
select_at(.vars = vars(contains("x")))
# loop over different values for 'K'
map_dbl(1:10, ~ {
#apply the knn function
out <- knn(train_data, train_label, test_data, K = .x)
sum(out - test_label)^2}
)
})
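The key step is copying the outer element to a name before the inner lambda reuses .x. Here is a minimal sketch of just that step, with random vectors standing in for the bootstrap samples and a simple summary standing in for knn (both are assumptions for illustration). The samples are built with map() rather than rerun(), which is deprecated as of purrr 1.0.0.

```r
library(purrr)

set.seed(1)
# Hypothetical stand-ins for the bootstrap samples: three numeric vectors
samples <- map(1:3, function(i) rnorm(20))

res <- map(samples, ~ {
  # Copy the outer element to a name before the inner lambda reuses .x
  smp <- .x
  # Inner loop: .x is now the value of k, smp is still the outer sample
  map_dbl(1:4, ~ mean(head(sort(smp), .x)))
})

lengths(res)   # one value per k for each sample
```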

Nested loop on dates using purrr map

If you're looking for a tidyverse solution, I would suggest this one:

full_dosing <- dosing %>%
mutate(labdat = dosdat) %>%
group_by(ID) %>%
complete(labdat = seq(min(labdat), max(labdat), "day"), ID) %>%
fill(dosdat, dosrec, doslev) %>%
ungroup()

left_join(labs, full_dosing, by = c("ID", "labdat"))
  ID labrec labval     labdat dosrec doslev     dosdat
1  1      1   4.92 2020-06-17      2    0.1 2020-06-15
2  1      2   2.89 2020-06-24      3    0.1 2020-06-22
3  1      3  14.01 2020-07-08      4    0.9 2020-07-07
4  2      1   3.92 2020-06-06      1    0.2 2020-06-05
5  2      2  17.58 2020-06-26      3    0.3 2020-06-24

However, it is less efficient than the data.table solution because you need to complete the dosing dataframe first.


The solution is based on this data:

#' Dosing file
#' ----------------------------------
dosdatID1<-c("2020-06-06", "2020-06-15", "2020-06-22", "2020-07-07", "2020-07-17")
dosdatID2<-c("2020-06-05", "2020-06-08", "2020-06-24", "2020-06-27")
dosing<-data.frame(
ID=c(rep(1, 5), rep(2, 4)),
dosrec=c(1:5, 1:4),
doslev=c(c(0.1, 0.1, 0.1, 0.9, 0.9), c(0.2, 0.2, 0.3, 0.3)),
dosdat=as.Date(c(dosdatID1, dosdatID2)))

#' Lab values file
#' ----------------------------------
labdatID1<-c("2020-06-17", "2020-06-24", "2020-07-08")
labdatID2<-c("2020-06-06", "2020-06-26")
labs<-data.frame(
ID=c(rep(1, 3), rep(2, 2)),
labrec=c(1:3, 1:2),
labval=round(c(rnorm(3, 10, 5), rnorm(2, 15, 5)), 2),
labdat=as.Date(c(labdatID1, labdatID2))
)
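Since the question asks about purrr specifically, the matching can also be written as a map over the lab rows. This is only a sketch on a cut-down version of the data above, not the approach from the answer: it iterates over row indices and, for each lab date, keeps the latest dosing record on or before it. It assumes every lab row has at least one earlier dosing record for the same ID.

```r
library(purrr)

dosing <- data.frame(
  ID = c(1, 1, 1, 2, 2),
  dosrec = c(1, 2, 3, 1, 2),
  dosdat = as.Date(c("2020-06-06", "2020-06-15", "2020-06-22",
                     "2020-06-05", "2020-06-24"))
)
labs <- data.frame(
  ID = c(1, 2),
  labdat = as.Date(c("2020-06-17", "2020-06-26"))
)

# For each lab row, keep the latest dosing record on or before the lab date
# (assumes at least one such record exists for every lab row)
matched <- map(seq_len(nrow(labs)), function(i) {
  cand <- dosing[dosing$ID == labs$ID[i] & dosing$dosdat <= labs$labdat[i], ]
  latest <- which.max(as.numeric(cand$dosdat))
  cbind(labs[i, ], cand[latest, c("dosrec", "dosdat")])
})
result <- do.call(rbind, matched)
result
```

This avoids completing the dosing data day by day, at the cost of filtering dosing once per lab row.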

Map function to second level of nested list using purrr

The recipe for this kind of problem is always the same:

Decompose the problem, solve it for an individual case, and then put it back together inside out.

As you observed, mtcars %>% split(.$cyl) gives you a list of lists (list of data.frames). You want to map mean over the inner lists.

So let’s do it for one list first:

# mtcars_split is the split list described above
mtcars_split <- mtcars %>% split(.$cyl)

mtcars_split[[1]] %>% map_dbl(mean)
# Or, equivalently:
map_dbl(mtcars_split[[1]], mean)

This works. We’ve decomposed the problem and successfully solved it for an individual case: In other words, given a list x and a transformation f, we’ve solved the problem for x[[1]] by executing f(x[[1]]) (which is equivalent to x[[1]] %>% f()).

Time to generalise it to all cases. And we already know how to generalise a transformation of an element x[[1]] to a whole list x: use map on that list:

x %>% map(~ .x %>% f())
# or, equivalently:
x %>% map(~ f(.x))
# or, equivalently:
map(x, ~ f(.x))
# or, finally:
map(x, f)

Let’s do the exact same thing, with x and f substituted by mtcars_split and map_dbl(mean), respectively:

mtcars_split %>% map(~ .x %>% map_dbl(mean))
# or, equivalently:
mtcars_split %>% map(~ map_dbl(.x, mean))

And this can be simplified the same way as our example above:

mtcars_split %>% map(map_dbl, mean)
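Put together as a self-contained script, the steps look roughly like this. It uses base split() without the pipe so that only purrr needs to be loaded; one_group and all_groups are illustrative names.

```r
library(purrr)

mtcars_split <- split(mtcars, mtcars$cyl)

# Inner step for one group: a named numeric vector of column means
one_group <- map_dbl(mtcars_split[[1]], mean)

# The same step applied to every group
all_groups <- map(mtcars_split, map_dbl, mean)

names(all_groups)            # one entry per value of cyl
all_groups[["4"]][["mpg"]]   # mean mpg of the 4-cylinder cars
```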

R: Passing different-lengthed inputs to purrr with nested data structures

Consider using a nested map. Loop over the 'bar' list, then loop over 'foo' inside it and paste the two values together. This returns a nested list, as in the OP's expected output.

library(purrr)
out2 <- map(bar, ~ map(foo, function(y) paste0(.x, '_', y)))
identical(out, out2)
#[1] TRUE

An equivalent option in base R is

lapply(bar, function(x) lapply(foo, function(y) paste0(x, '_', y)))

Or, in base R, we could use outer to create a matrix of strings, split it by row into a list of vectors (asplit with MARGIN = 1), and then loop over that list, converting each vector to a list with as.list

out3 <- lapply(asplit(outer(bar, paste0('_', foo), FUN = paste0), 1), as.list)
identical(out, out3)
#[1] TRUE
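The snippets above rely on foo and bar from the question, which are not shown here. A minimal sketch with hypothetical inputs of different lengths (the values are assumptions) shows the shape of the result and that the purrr and lapply versions agree.

```r
library(purrr)

# Hypothetical inputs of different lengths
bar <- c("a", "b")
foo <- c("x", "y", "z")

# Outer loop over bar, inner loop over foo
nested_purrr <- map(bar, ~ map(foo, function(y) paste0(.x, "_", y)))
nested_base  <- lapply(bar, function(x) lapply(foo, function(y) paste0(x, "_", y)))

identical(nested_purrr, nested_base)
nested_purrr[[2]][[3]]   # second element of bar combined with third element of foo
```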

Nested loop with purrr::walk leaves empty plots

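Just wrapping your plots in a print statement solves the issue. Inside a function, a plot object is not auto-printed the way it is at the top level, so it has to be printed explicitly.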

walk(mods, function(.x) {
print(wrap_plots(plotlist = exdata %>% filter(cut == .x) %>% pull(plots), ncol = colsn, nrow = rowsn) + plot_annotation(title = " "))
walk2(seq(0.165,0.835, length.out = colsn), unique(exdata$dummy), ~ grid.draw(textGrob(.y, x = .x, y = 0.98, rot = 0)))
})
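The underlying behaviour can be seen without any plotting packages. In this sketch, capture.output stands in for the plotting device and plain numbers stand in for the plots, both purely for illustration.

```r
library(purrr)

# Without print(): each value is computed inside the function and then discarded
silent <- capture.output(walk(1:2, function(i) i * 10))

# With print(): each value is written out as a side effect
shown <- capture.output(walk(1:2, function(i) print(i * 10)))

length(silent)   # number of lines produced without print()
shown            # lines produced with print()
```

walk also returns its input invisibly, so nothing is printed at the top level either.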


How to make nested purrr map to extract rows based on dynamic variables instead of nested loop?

Since you mention that other alternatives are also welcome, consider base R. Several issues derive from your initial (non-purrr) setup:

  1. One of the biggest issues with the original code is calling rbind inside a loop, which leads to excessive copying in memory, as explained in the SO thread "Replace rbind in for-loop with lapply? (2nd circle of hell)" and in Patrick Burns's The R Inferno, Circle 2: Growing Objects. To resolve this, build a list of data frames and bind them once, outside the loop.

  2. The repeated use of the scoping assignment, <<-, to affect the global environment from inside a local function appears to be unneeded, especially since the temp objects are replaced on each iteration, so only the values from the last iteration persist. This operator is often discouraged because code that modifies global variables is hard to debug. Functions are easiest to work with when they return a single object.

  3. You initialize an empty data frame, df.exp, before calling calc() but overwrite it inside the loop with <<-. Usually, after creating an empty matrix or data frame, one assigns to it row by row inside the loop, but this is not done here.

  4. Looping through unique() values can be replaced with by() or split(), which also avoids calling dplyr::filter() inside the function. By the way, using pipes (%>%) inside loops can carry a performance cost.

  5. Rather than a for loop, use the apply family, such as lapply, to build a list of objects. This avoids the bookkeeping a for loop needs, namely initializing an empty list and assigning elements to it (though there is nothing wrong with that approach). It also avoids the use of <<- within the function.

Base R (using by, lapply, and do.call)

calc <- function(sub) {

## Extract records by "mid" excluding the first records
temp <- sub[2:nrow(sub),]

## Extract row number of "aprps==4"
r.aprps <- which(temp$aprps==4)

## Store exp dataframes in list
subdf_list <- lapply(seq_along(r.aprps), function(j) {

## Extract movement by two pairs of rows based on "r.aprps"
temp2 <- temp[c((r.aprps[j]-1):r.aprps[j]),]

## Other operations in actual data set (just put example)
exp <- data.frame(mid=unique(temp2$mid), expsum=sum(temp2$exph))

return(exp)
})

df.exp <- do.call(rbind, subdf_list)
return(df.exp)
}

## subset by mid and pass subsets to calc()
df_list <- by(df, df$mid, calc)

## append all in final object
final_df <- do.call(rbind, df_list)

Because base::rbind.data.frame has some disadvantages, consider third-party packages as replacement of do.call(rbind, ...) such as dplyr::bind_rows() and data.table::rbindlist().

df.exp  <- dplyr::bind_rows(subdf_list) 
...
final_df <- dplyr::bind_rows(df_list)


df.exp <- data.table::rbindlist(subdf_list)
...
final_df <- data.table::rbindlist(df_list)
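If you still prefer purrr, the same structure carries over with map in place of lapply and by. This is a sketch on toy data: the data frame below is an assumption that only mimics the columns used above (mid, aprps, exph). As in the code above, it assumes a matching row is never the first remaining row of its group.

```r
library(purrr)

# Toy data (hypothetical): mid groups the rows, aprps == 4 marks rows of interest
df <- data.frame(
  mid   = c(1, 1, 1, 1, 2, 2, 2),
  aprps = c(1, 2, 4, 4, 1, 3, 4),
  exph  = c(10, 20, 30, 40, 5, 6, 7)
)

calc <- function(sub) {
  temp <- sub[-1, ]                   # drop the first record of the group
  r.aprps <- which(temp$aprps == 4)   # row numbers where aprps == 4
  # One small data frame per match, built from the matching row and the row before it
  pieces <- map(r.aprps, function(r) {
    temp2 <- temp[(r - 1):r, ]
    data.frame(mid = unique(temp2$mid), expsum = sum(temp2$exph))
  })
  do.call(rbind, pieces)              # bind once, after the loop
}

# Split by mid, apply calc() to each subset, bind once at the end
final_df <- do.call(rbind, map(split(df, df$mid), calc))
final_df
```

Nothing is grown inside a loop and no global variable is modified; each function returns a single object.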

Use purrr::map*() to assign a nested list element to another nested list

Most of the logic is unrelated to ‘purrr’ functionality but of course you can use map2 to apply the logic to all pairwise items in the two lists:

listB_mod = map2(
listA, listB,
~ list(labels = .x$names[match(.x$names_num, .y$order)], order = .y$order)
)
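The call above assumes that listA and listB have the same length and that their elements carry the fields names, names_num and order. The inputs below are hypothetical and only serve to show that map2 pairs element i of one list with element i of the other.

```r
library(purrr)

# Hypothetical inputs: element i of listA is paired with element i of listB
listA <- list(
  list(names = c("a", "b", "c"), names_num = c(1, 2, 3)),
  list(names = c("x", "y"),      names_num = c(1, 2))
)
listB <- list(
  list(order = c(3, 1, 2)),
  list(order = c(2, 1))
)

listB_mod <- map2(
  listA, listB,
  ~ list(labels = .x$names[match(.x$names_num, .y$order)], order = .y$order)
)

str(listB_mod)
```

Inside the lambda, .x is always the element from listA and .y the element from listB.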

