purrr map equivalent of nested for loop
As @r2evans points out, the .x from your first call is masked. However, you can create a lambda function that takes two parameters, .x and .y, and assign the previous .x to the new .y through the ... argument.
I'll use walk rather than map, as in this case you're only interested in side effects (printing):
walk(1:4,~ walk(1:6, ~ print(paste(.x, .y, sep = "-")),.y=.x))
Another option is to use expand.grid to lay out the combinations, and then iterate over those with pwalk (or pmap in other circumstances). The first column of expand.grid varies fastest, so the inner range 1:6 goes first:
purrr::pwalk(expand.grid(1:6, 1:4), ~ print(paste(.x, .y, sep = "-")))
Output in both cases:
[1] "1-1"
[1] "2-1"
[1] "3-1"
[1] "4-1"
[1] "5-1"
[1] "6-1"
[1] "1-2"
[1] "2-2"
[1] "3-2"
[1] "4-2"
[1] "5-2"
[1] "6-2"
[1] "1-3"
[1] "2-3"
[1] "3-3"
[1] "4-3"
[1] "5-3"
[1] "6-3"
[1] "1-4"
[1] "2-4"
[1] "3-4"
[1] "4-4"
[1] "5-4"
[1] "6-4"
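As a quick check that the two approaches visit the combinations in the same order, the strings can be collected instead of printed. This is only a sketch: it swaps walk/pwalk for map_chr/pmap_chr so that there is something to compare, and the variable names nested and flat are made up for illustration.

```r
library(purrr)

# Nested version: the outer value is handed to the inner lambda as .y
nested <- unlist(map(1:4, ~ map_chr(1:6, ~ paste(.x, .y, sep = "-"), .y = .x)))

# Grid version: the first column of expand.grid() varies fastest,
# so the inner range (1:6) is listed first
flat <- pmap_chr(expand.grid(1:6, 1:4), ~ paste(.x, .y, sep = "-"))

# Both are character vectors of length 24 in the same order
identical(nested, flat)
```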
Nested looping with map from purrr package in R
I would use an anonymous function call here to avoid confusion and to be clear about what is what.
map(0:1, function(x)
map_df(0:7,function(y)
data.frame(summary(emtrends(fit,
~ses, var= "time", max.degree= 2,at =list(ses=x,time=y))))[1,]))
#[[1]]
# ses degree time.trend SE df lower.CL upper.CL
#1 0 linear 0.34156633 0.06084479 1594 0.22222212 0.4609105
#2 0 linear 0.29736820 0.04502015 1594 0.20906326 0.3856731
#3 0 linear 0.25317006 0.03015484 1594 0.19402275 0.3123174
#4 0 linear 0.20897192 0.01869728 1594 0.17229808 0.2456458
#5 0 linear 0.16477378 0.01864500 1594 0.12820248 0.2013451
#6 0 linear 0.12057564 0.03005757 1594 0.06161911 0.1795322
#7 0 linear 0.07637750 0.04491162 1594 -0.01171454 0.1644695
#8 0 linear 0.03217936 0.06073239 1594 -0.08694438 0.1513031
#[[2]]
# ses degree time.trend SE df lower.CL upper.CL
#1 1 linear 0.43528003 0.08297255 1594 0.27253325 0.5980268
#2 1 linear 0.38631190 0.06139288 1594 0.26589262 0.5067312
#3 1 linear 0.33734376 0.04112141 1594 0.25668603 0.4180015
#4 1 linear 0.28837563 0.02549702 1594 0.23836441 0.3383868
#5 1 linear 0.23940750 0.02542573 1594 0.18953612 0.2892789
#6 1 linear 0.19043936 0.04098878 1594 0.11004179 0.2708369
#7 1 linear 0.14147123 0.06124487 1594 0.02134227 0.2616002
#8 1 linear 0.09250309 0.08281927 1594 -0.06994303 0.2549492
How to convert this nested for loop into a purrr function
For the first purrr call, give the frame_list, and for the second one, give it the names of the data table. I added a blah column to dataframe y, also changed
library(dplyr)
x <- data.frame(first = c(1:3, 6:7),
second = c(1:3, 6:7),
third = c(1:5))
y <- data.frame(differ = c(10:13),
other = c(10:13),
nomatch = rep(5),
blah=rep(5),
another = c(10:13),
argh = c(9:12))
z <- data.frame(nothing = c(20:27),
nope = c(20:27),
noway = c(16,17, 18, 19, 5, 21, 22, 23),
still = c(21:28),
again = c(22:29))
frame_list <- list(first = x,
second = y,
third = z)
check_against <- rep(5, 5)
col_name_list=purrr::map(frame_list, function(w) {
m=purrr::map_lgl(w, function(x) {
return(any(unlist(x) %in% check_against))
})
names(w)[m]
})
col_name_list
Output:
$first
[1] "third"
$second
[1] "nomatch" "blah"
$third
[1] "noway"
nested map functions with purrr
We can use another loop nested in map to run for different values of "K":
library(tidyverse)
rerun(5, gen_boot_sample(train_data)) %>%
map(~ {
# create the subset datasets
train_data <- .x %>%
select_at(vars(contains('x')))
train_label <- .x %>%
select_at(.vars = vars(contains("y"))) %>%
pull()
test_data <- test_data %>%
select_at(.vars = vars(contains("x")))
# loop over different values for 'K'
map_dbl(1:10, ~ {
#apply the knn function
out <- knn(train_data, train_label, test_data, K = .x)
sum(out - test_label)^2}
)
})
Nested loop on dates using purrr map
If you're looking for a tidyverse solution, I would suggest this one:
full_dosing <- dosing %>%
mutate(labdat = dosdat) %>%
group_by(ID) %>%
complete(labdat = seq(min(labdat), max(labdat), "day"), ID) %>%
fill(dosdat, dosrec, doslev) %>%
ungroup()
left_join(labs, full_dosing, by = c("ID", "labdat"))
ID labrec labval labdat dosrec doslev dosdat
1 1 1 4.92 2020-06-17 2 0.1 2020-06-15
2 1 2 2.89 2020-06-24 3 0.1 2020-06-22
3 1 3 14.01 2020-07-08 4 0.9 2020-07-07
4 2 1 3.92 2020-06-06 1 0.2 2020-06-05
5 2 2 17.58 2020-06-26 3 0.3 2020-06-24
However, it is less efficient than the data.table solution because you need to complete the dosing data frame first.
The solution is based on this data:
#' Dosing file
#' ----------------------------------
dosdatID1<-c("2020-06-06", "2020-06-15", "2020-06-22", "2020-07-07", "2020-07-17")
dosdatID2<-c("2020-06-05", "2020-06-08", "2020-06-24", "2020-06-27")
dosing<-data.frame(
ID=c(rep(1, 5), rep(2, 4)),
dosrec=c(1:5, 1:4),
doslev=c(c(0.1, 0.1, 0.1, 0.9, 0.9), c(0.2, 0.2, 0.3, 0.3)),
dosdat=as.Date(c(dosdatID1, dosdatID2)))
#' Lab values file
#' ----------------------------------
labdatID1<-c("2020-06-17", "2020-06-24", "2020-07-08")
labdatID2<-c("2020-06-06", "2020-06-26")
labs<-data.frame(
ID=c(rep(1, 3), rep(2, 2)),
labrec=c(1:3, 1:2),
labval=round(c(rnorm(3, 10, 5), rnorm(2, 15, 5)), 2),
labdat=as.Date(c(labdatID1, labdatID2))
)
Map function to second level of nested list using purrr
The recipe for this kind of problem is always the same:
Decompose the problem, solve it for an individual case, and then put it back together inside out.
As you observed, mtcars %>% split(.$cyl) gives you a list of lists (a list of data.frames); call it mtcars_split. You want to map mean over the inner lists.
So let’s do it for one list first:
mtcars_split[[1]] %>% map_dbl(mean)
# Or, equivalently:
map_dbl(mtcars_split[[1]], mean)
This works. We’ve decomposed the problem and successfully solved it for an individual case. In other words, given a list x and a transformation f, we’ve solved the problem for x[[1]] by executing f(x[[1]]) (which is equivalent to x[[1]] %>% f()).
Time to generalise it to all cases. And we already know how to generalise a transformation of an element x[[1]] to a whole list x: use map on that list:
x %>% map(~ .x %>% f())
# or, equivalently:
x %>% map(~ f(.x))
# or, equivalently:
map(x, ~ f(.x))
# or, finally:
map(x, f)
Let’s do the exact same thing, with x and f substituted by mtcars_split and map_dbl(mean), respectively:
mtcars_split %>% map(~ .x %>% map_dbl(mean))
# or, equivalently:
mtcars_split %>% map(~ map_dbl(.x, mean))
And this can be simplified the same way as our example above:
mtcars_split %>% map(map_dbl, mean)
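To see that the long and short forms agree, here is a small runnable sketch using the built-in mtcars data. The names means_long and means_short are made up for illustration, and split() is called directly rather than through the pipe.

```r
library(purrr)

# List of data frames, one per number of cylinders
mtcars_split <- split(mtcars, mtcars$cyl)

# Explicit inner lambda
means_long <- map(mtcars_split, ~ map_dbl(.x, mean))

# Same thing, passing map_dbl as the function and mean as its extra argument
means_short <- map(mtcars_split, map_dbl, mean)

identical(means_long, means_short)
```

Each element of the result is a named numeric vector holding one mean per column of the corresponding data frame.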
R: Passing different-lengthed inputs to purrr with nested data structures
Consider using a nested map. Loop over the 'bar' list, then loop over 'foo' and paste. This will return a nested list, as in the OP's expected output:
library(purrr)
out2 <- map(bar, ~ map(foo, function(y) paste0(.x, '_', y)))
identical(out, out2)
#[1] TRUE
The equivalent option in base R is:
lapply(bar, function(x) lapply(foo, function(y) paste0(x, '_', y)))
Or with base R, we could use outer to create a matrix of strings, then split it by row (asplit with MARGIN as 1) into a list of vectors, loop over the list, and convert each element of each vector to a list element with as.list:
out3 <- lapply(asplit(outer(bar, paste0('_', foo), FUN = paste0), 1), as.list)
identical(out, out3)
#[1] TRUE
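The question's foo and bar are not shown here, so the following sketch uses two hypothetical character vectors to compare the nested map with the nested lapply. The names nested_purrr and nested_base are also made up.

```r
library(purrr)

# Hypothetical inputs of different lengths (not the OP's data)
foo <- c("a", "b", "c")
bar <- c("x", "y")

# Outer loop over bar, inner loop over foo; .x is the current element of bar
nested_purrr <- map(bar, ~ map(foo, function(y) paste0(.x, "_", y)))

# Base R counterpart
nested_base <- lapply(bar, function(x) lapply(foo, function(y) paste0(x, "_", y)))

identical(nested_purrr, nested_base)
lengths(nested_purrr)
```

The result has one outer element per entry of bar, each holding one string per entry of foo.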
Nested loop with purrr::walk leaves empty plots
Just wrapping your plots in a print statement solves the issue
walk(mods, function(.x) {
print(wrap_plots(plotlist = exdata %>% filter(cut == .x) %>% pull(plots), ncol = colsn, nrow = rowsn) + plot_annotation(title = " "))
walk2(seq(0.165,0.835, length.out = colsn), unique(exdata$dummy), ~ grid.draw(textGrob(.y, x = .x, y = 0.98, rot = 0)))
})
How to make nested purrr map to extract rows based on dynamic variables instead of nested loop?
Since you mention that other alternatives are also welcome, consider base R. Several issues derive from your initial (non-purrr) setup:
One of the biggest issues with the original code is using rbind inside a loop, which leads to excessive copying in memory, as explained in the SO thread Replace rbind in for-loop with lapply? (2nd circle of hell) and in Patrick Burns's R Inferno, Circle 2: Growing Objects. To resolve this, build a list of data frames and bind it once outside the loop.
The repeated use of the scoping assignment operator, <<-, to affect the global environment from inside a local function appears to be unneeded, especially since the temp objects are replaced on each iteration, so only the last iteration's values remain. This operator is often discouraged because code that modifies global variables is tough to debug. Functions are best handled when they return one object.
You initialize an empty data frame, df.exp, before calling calc() but overwrite it inside the loop with <<-. Usually, after creating an empty matrix or data frame, one assigns to it by row inside the loop, but that is not done here.
Looping through unique() values can be replaced with by() or split(), which also avoids using dplyr::filter() inside the function. By the way, there are performance challenges with using pipes, %>%, inside loops.
Rather than a for loop, use the apply family, such as lapply, to build a list of objects. This avoids the bookkeeping of for loops, which need to initialize an empty list and assign elements to it (though there is nothing wrong with that approach). It also avoids the use of <<- within the function.
Base R (using by, lapply, and do.call)
calc <- function(sub) {
## Extract records by "mid" excluding the first records
temp <- sub[2:nrow(sub),]
## Extract row number of "aprps==4"
r.aprps <- which(temp$aprps==4)
## Store exp dataframes in list
subdf_list <- lapply(seq_along(r.aprps), function(j) {
## Extract movement by two pairs of rows based on "r.aprps"
temp2 <- temp[c((r.aprps[j]-1):r.aprps[j]),]
## Other operations in actual data set (just put example)
exp <- data.frame(mid=unique(temp2$mid), expsum=sum(temp2$exph))
return(exp)
})
df.exp <- do.call(rbind, subdf_list)
return(df.exp)
}
## subset by mid and pass subsets to calc()
df_list <- by(df, df$mid, calc)
## append all in final object
final_df <- do.call(rbind, df_list)
Because base::rbind.data.frame has some disadvantages, consider third-party replacements for do.call(rbind, ...) such as dplyr::bind_rows() and data.table::rbindlist().
df.exp <- dplyr::bind_rows(subdf_list)
...
final_df <- dplyr::bind_rows(df_list)
df.exp <- data.table::rbindlist(subdf_list)
...
final_df <- data.table::rbindlist(df_list)
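The "build a list, then bind once" pattern can be tried in isolation on a toy data frame. This is a simplified sketch, not the calc() function above: the data are made up, only the column names mid and exph are borrowed from the answer, and the per-group operation is reduced to a plain sum.

```r
# Hypothetical stand-in for the question's data
df <- data.frame(mid = c(1, 1, 1, 2, 2), exph = c(10, 20, 30, 5, 15))

# One small data frame per group, collected in a list (no rbind inside the loop)
pieces <- lapply(split(df, df$mid), function(sub) {
  data.frame(mid = sub$mid[1], expsum = sum(sub$exph))
})

# Bind all pieces once, after the loop has finished
final_df <- do.call(rbind, pieces)
final_df
```

Nothing is assigned outside the anonymous function, so <<- is not needed.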
Use purrr::map*() to assign a nested list element to another nested list
Most of the logic is unrelated to ‘purrr’ functionality, but of course you can use map2 to apply the logic to all pairwise items in the two lists:
listB_mod = map2(
listA, listB,
~ list(labels = .x$names[match(.x$names_num, .y$order)], order = .y$order)
)
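To make the pairwise behaviour of map2 concrete, here is the same call on two small hypothetical lists. Only the field names (names, names_num, order) come from the answer; the values are invented, so the real data may behave differently.

```r
library(purrr)

# Hypothetical inputs using the field names from the answer
listA <- list(
  list(names = c("a", "b", "c"), names_num = c(1, 2, 3)),
  list(names = c("d", "e"), names_num = c(1, 2))
)
listB <- list(
  list(order = c(3, 1, 2)),
  list(order = c(2, 1))
)

# .x is the current element of listA, .y the matching element of listB
listB_mod <- map2(
  listA, listB,
  ~ list(labels = .x$names[match(.x$names_num, .y$order)], order = .y$order)
)
str(listB_mod)
```

map2 walks both lists in parallel, so the first element of listA is paired only with the first element of listB, and so on.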