lapply-ing with the $ function
This is documented in ?lapply, in the "Note" section (emphasis mine):
For historical reasons, the calls created by lapply are unevaluated, and code has been written (e.g. bquote) that relies on this. This means that the recorded call is always of the form FUN(X[[0L]], ...), with 0L replaced by the current integer index. This is not normally a problem, but it can be if FUN uses sys.call or match.call or if it is a primitive function that makes use of the call. This means that it is often safer to call primitive functions with a wrapper, so that e.g. lapply(ll, function(x) is.numeric(x)) is required in R 2.7.1 to ensure that method dispatch for is.numeric occurs correctly.
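A minimal sketch of what the note describes, assuming a small nested list named records (hypothetical, used only for illustration). It first records the call that lapply builds, then extracts an element in two ways that do not depend on that call: an ordinary wrapper function around $, and `[[` with the name passed as a regular argument.

```r
# Record the call that lapply() constructs for each element.
calls <- lapply(1:2, function(x) sys.call())
print(calls[[1]])  # the element is referenced by index, not by its value

# Hypothetical nested list used only for illustration.
records <- list(a = list(x = 1, y = "u"), b = list(x = 2, y = "v"))

# Wrapping `$` in an ordinary closure avoids any reliance on the recorded call.
xs_wrapper <- lapply(records, function(rec) rec$x)

# `[[` receives the element name as a regular, evaluated argument.
xs_bracket <- lapply(records, `[[`, "x")
```

Both forms should return the same named list; the wrapper is the more general choice because it also works for functions that inspect their call.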
Modify the code (lapply function) in R
[1]
modelList <- lapply(mtcars[-c(4,9)], function(x) aov(x ~ hp*am, data=mtcars) )
[2]
df2 <- plyr::ldply(modelList, function(x) summary(x)[[1]][["Pr(>F)"]])
# ldply() adds a leading .id column for a named list, so rename only the p-value columns
names(df2)[-1] <- c(attr(modelList[[1]]$terms, "term.labels"), "residuals")
[3]
res.list <- lapply(modelList, '[[', "residuals")
par(mfrow=c(5,2), oma=c(0,0,0,0))
lapply(res.list, hist)
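If you would rather not depend on plyr for step [2], the same table of p-values can probably be built with base R alone. This is only a sketch of that alternative, not the original answer's method; pvals is a name introduced here for illustration.

```r
# Fit one two-way ANOVA per response column (hp and am are the predictors).
modelList <- lapply(mtcars[-c(4, 9)], function(x) aov(x ~ hp * am, data = mtcars))

# sapply() returns one column per model, so transpose to get one row per model.
# The residual row of an ANOVA table has no p-value, hence the NA column.
pvals <- t(sapply(modelList, function(m) summary(m)[[1]][["Pr(>F)"]]))
colnames(pvals) <- c(attr(modelList[[1]]$terms, "term.labels"), "residuals")
```

The result is a numeric matrix whose row names are the response variables, which avoids the extra .id column that ldply produces.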
Multiplying elements of list with lapply is almost twice as fast with in-line function definition than with standard *
lapply calls match.fun, which must spend some time (well, about a microsecond) matching the string "*" to the primitive function `*`. Passing the function directly avoids the overhead.
l <- list(1, 2, 3)
microbenchmark::microbenchmark(lapply(l, function(x) x * 1000),
lapply(l, "*", 1000),
lapply(l, `*`, 1000),
times = 1e+06L)
## Unit: nanoseconds
## expr min lq mean median uq max neval
## lapply(l, function(x) x * 1000) 1271 1435 1614.497 1476 1517 1243981 1e+06
## lapply(l, "*", 1000) 1640 1763 2026.791 1804 1886 16498605 1e+06
## lapply(l, `*`, 1000) 861 984 1198.956 1025 1066 16636365 1e+06
microbenchmark::microbenchmark(match.fun(function(x) x * 1000),
match.fun("*"),
match.fun(`*`),
times = 1e+06L)
## Unit: nanoseconds
## expr min lq mean median uq max neval
## match.fun(function(x) x * 1000) 82 164 249.0617 205 205 15783606 1e+06
## match.fun("*") 779 902 1036.1593 902 984 15515261 1e+06
## match.fun(`*`) 41 164 187.4243 164 164 588842 1e+06
That said, match.fun is never going to be a bottleneck, unless maybe you've written a function that calls match.fun a few billion times, so optimizing at this level would just be "for fun".
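As a quick sanity check that the three spellings differ only in lookup cost and not in behavior, the sketch below resolves the string with match.fun and compares the results of the three lapply calls. The variable names are introduced here for illustration.

```r
# match.fun() resolves a function name given as a string to the function itself.
same_function <- identical(match.fun("*"), `*`)

l <- list(1, 2, 3)
by_closure <- lapply(l, function(x) x * 1000)  # in-line function definition
by_string  <- lapply(l, "*", 1000)             # name looked up via match.fun()
by_object  <- lapply(l, `*`, 1000)             # primitive passed directly
```

All three calls should produce the same list, so the choice between them can be made on readability.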
Lapplying a function over two lists of dataframes in R
Here, we could use Map from base R to apply the function to the corresponding elements of both lists:
out <- Map(my_function, list_A, list_B)
lapply can also be used, if we loop over the sequence of indices of one of the lists:
out <- lapply(seq_along(list_A), function(i)
my_function(list_A[[i]], list_B[[i]]))
which is similar to using a for loop:
out <- vector('list', length(list_A))
for(i in seq_along(list_A)) out[[i]] <- my_function(list_A[[i]], list_B[[i]])
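To make the comparison concrete, here is a self-contained sketch with made-up inputs. The contents of list_A and list_B and the body of my_function are assumptions chosen only so the code runs; the length check is an extra precaution, since Map recycles shorter arguments rather than failing.

```r
# Hypothetical inputs: two lists of data frames with the same length.
list_A <- list(data.frame(id = 1:2), data.frame(id = 1:3))
list_B <- list(data.frame(id = 1:4), data.frame(id = 1:5))

# Hypothetical stand-in for my_function: total row count of a pair.
my_function <- function(a, b) nrow(a) + nrow(b)

# Guard against silent recycling when the lists differ in length.
stopifnot(length(list_A) == length(list_B))

out_map <- Map(my_function, list_A, list_B)
out_lapply <- lapply(seq_along(list_A), function(i)
  my_function(list_A[[i]], list_B[[i]]))
```

With unnamed input lists the two results are the same; when the first list has names, Map carries them over to the output, while the index-based lapply version does not.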
Using lapply with if to test each element in a list
It pains me to answer this because it's very un-R to do this. You could be more explicit and use braces, as in:
lapply(alist, function(x) if (x > 7) {1} else {0})
Or the vectorized ifelse:
lapply(alist, function(x) ifelse(x > 7, 1, 0))
Or best of all:
as.numeric(alist > 7)
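A small runnable comparison of these options, assuming alist is a list of single numbers (the values below are made up). The vapply variant is an addition not in the answer above; it is a reasonable middle ground when you want a plain numeric vector but still need to apply a function per element.

```r
# Hypothetical input: a list of single numbers.
alist <- list(3, 8, 10)

# Element-wise test with if/else, returning a list.
flags_if <- lapply(alist, function(x) if (x > 7) 1 else 0)

# Same test with vapply(), which returns a numeric vector and checks each result's type.
flags_vapply <- vapply(alist, function(x) as.numeric(x > 7), numeric(1))

# Fully vectorized: flatten the list first, then compare once.
flags_vec <- as.numeric(unlist(alist) > 7)
```

The vectorized form is the shortest and usually the fastest, but it relies on every element being a single number.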