What's the difference between lapply and do.call?
There is a function called `Map` that may be similar to map in other languages. From the R documentation:
- `lapply` returns a list of the same length as X, each element of which is the result of applying FUN to the corresponding element of X.
- `do.call` constructs and executes a function call from a name or a function and a list of arguments to be passed to it.
- `Map` applies a function to the corresponding elements of given vectors... `Map` is a simple wrapper to `mapply` which does not attempt to simplify the result, similar to Common Lisp's mapcar (with arguments being recycled, however). Future versions may allow some control of the result type.
- `Map` is a wrapper around `mapply`.
- `lapply` is a special case of `mapply`.
- Therefore `Map` and `lapply` will be similar in many cases.
For example, here is `lapply`:
lapply(iris, class)
$Sepal.Length
[1] "numeric"
$Sepal.Width
[1] "numeric"
$Petal.Length
[1] "numeric"
$Petal.Width
[1] "numeric"
$Species
[1] "factor"
And the same using `Map`:
Map(class, iris)
$Sepal.Length
[1] "numeric"
$Sepal.Width
[1] "numeric"
$Petal.Length
[1] "numeric"
$Petal.Width
[1] "numeric"
$Species
[1] "factor"
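The two functions stop being interchangeable once more than one vector is involved: `lapply` walks a single list, while `Map` walks several in parallel. A minimal sketch of that difference, using made-up vectors `xs` and `ys` that are not part of the original answer:

```r
# Hypothetical inputs, chosen only for illustration
xs <- 1:3
ys <- 4:6

# With a single input, Map and lapply give the same list (as with iris above)
single_input_same <- identical(Map(class, iris), lapply(iris, class))

# With two inputs, Map pairs up corresponding elements: xs[1] + ys[1], ...
pair_sums <- Map(function(x, y) x + y, xs, ys)

# The same call written directly against mapply, with simplification turned off
pair_sums_mapply <- mapply(function(x, y) x + y, xs, ys, SIMPLIFY = FALSE)
```

Both `pair_sums` and `pair_sums_mapply` are plain lists, which is the "does not attempt to simplify" behaviour the documentation describes.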
`do.call` takes a function and a list of arguments, and calls the function with the list elements splatted in as its individual arguments. It is widely used, for example, to assemble lists into simpler structures (often with `rbind` or `cbind`).
For example:
x <- lapply(iris, class)
do.call(c, x)
Sepal.Length  Sepal.Width Petal.Length  Petal.Width      Species
   "numeric"    "numeric"    "numeric"    "numeric"     "factor"
R: apply vs do.call
- `apply(DF, 1, f)` converts each row of `DF` to a vector and then passes that vector to `f`. If `DF` were a mix of strings and numbers, the row would be converted to a character vector before being passed to `f`, so that, for example, `apply(iris, 1, function(x) sum(x[-5]))` will not work even though the row `iris[i, -5]` contains only numeric elements. The row is converted to a character vector, and you can't sum character strings. On the other hand, `apply(iris[-5], 1, sum)` will work the same as `rowSums(iris[-5])`.
- If `f` produces a vector, the result is a matrix and not another data frame; also, the result is the transpose of what you might expect. This:
apply(BOD, 1, identity)
gives the following rather than giving `BOD` back:
       [,1] [,2] [,3] [,4] [,5] [,6]
Time    1.0  2.0    3    4  5.0  7.0
demand  8.3 10.3   19   16 15.6 19.8
Many years ago Hadley Wickham posted `iapply`, which is idempotent in the sense that `iapply(mat, 1, identity)` returns `mat`, rather than `t(mat)`, where `mat` is a matrix. More recently, with his plyr package one can write:
library(plyr)
ddply(BOD, 1, identity)
and get `BOD` back as a data frame.
On the other hand, `apply(BOD, 1, sum)` will give the same result as `rowSums(BOD)`, and `apply(BOD, 1, f)` might be useful for functions `f` that produce a scalar and have no ready-made counterpart, as `sum` has in `rowSums`. Also, if `f` produces a vector and you don't mind a matrix result, you can transpose the output of `apply` yourself; although ugly, it would work.
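The points above can be checked with the built-in `iris` and `BOD` data sets. This is only a sketch of the behaviour described, and the variable names are chosen here for illustration:

```r
# Every row of iris reaches the function as a character vector,
# because the Species column forces the whole row to character
row_classes <- unique(apply(iris, 1, class))

# Summing such a row fails; capture the error instead of stopping
attempt <- tryCatch(
  apply(iris, 1, function(x) sum(x[-5])),
  error = function(e) e
)

# Dropping the factor column first keeps the rows numeric
sums_apply <- apply(iris[-5], 1, sum)
sums_fast <- rowSums(iris[-5])

# A vector-valued function gives a transposed matrix: 2 rows, 6 columns
flipped <- apply(BOD, 1, identity)

# Transposing by hand restores the original orientation (as a matrix)
restored <- t(flipped)
```

Note that `restored` is still a matrix, not a data frame, so `as.data.frame(restored)` would be needed to get back to something shaped like `BOD`.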
R package: using lapply to do.call a list of internal functions
Following the comments on the question, I have a (partial) answer:
- `do.call` by default evaluates character arguments in `parent.frame()`, and when called inside `lapply`, this is a different environment than the one the `tmp` functions are defined in (though I'm not sure about the specifics).
- Supplying the current environment to `do.call` makes the `lapply` approach work:
lapply(funs, do.call, args = list(x = "test"), envir = environment())
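A self-contained sketch of the situation, assuming the functions are looked up by name. The wrapper `run_all` and the helpers `tmp_upper` and `tmp_nchar` are hypothetical names invented for this example; the original question's functions are not shown in the source:

```r
run_all <- function() {
  # Hypothetical helpers that exist only inside this function's environment
  tmp_upper <- function(x) toupper(x)
  tmp_nchar <- function(x) nchar(x)
  funs <- c("tmp_upper", "tmp_nchar")

  # Without envir, do.call resolves the names from lapply's frame,
  # where the helpers are not visible, so the lookup fails
  without_envir <- tryCatch(
    lapply(funs, do.call, args = list(x = "test")),
    error = function(e) e
  )

  # With envir = environment(), the names are resolved in this function
  with_envir <- lapply(funs, do.call, args = list(x = "test"),
                       envir = environment())

  list(without_envir = without_envir, with_envir = with_envir)
}

result <- run_all()
```

Passing the functions themselves rather than their names, as in `lapply(list(tmp_upper, tmp_nchar), do.call, args = list(x = "test"))`, would sidestep the name lookup altogether.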
What is the difference between call and apply?
The difference is that `apply` lets you invoke the function with `arguments` as an array; `call` requires the parameters be listed explicitly. A useful mnemonic is "A for array and C for comma."
See MDN's documentation on apply and call.
Pseudo syntax:
theFunction.apply(valueForThis, arrayOfArgs)
theFunction.call(valueForThis, arg1, arg2, ...)
As of ES6, it is also possible to spread the array for use with the `call` function; you can see the compatibilities here.
Sample code:
function theFunction(name, profession) {
    console.log("My name is " + name + " and I am a " + profession + ".");
}
theFunction("John", "fireman");
theFunction.apply(undefined, ["Susan", "school teacher"]);
theFunction.call(undefined, "Claude", "mathematician");
theFunction.call(undefined, ...["Matthew", "physicist"]); // used with the spread operator
Difference between `do.call()` a function and directly call a function in R?
Instead of `as.name` you should use `get`:
length(get(paste0("all_data_align_",year)))
You need to retrieve the object itself, not just its name.
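The difference is easy to see on a small example. The vector `all_data_align_2020` and the value of `year` below are hypothetical stand-ins, since the original data is not shown:

```r
year <- 2020
all_data_align_2020 <- c(10, 20, 30)  # hypothetical stand-in for the real object

object_name <- paste0("all_data_align_", year)

# as.name() only builds a symbol; its length is that of the symbol, not the data
len_symbol <- length(as.name(object_name))

# get() looks the name up and returns the object it refers to
len_object <- length(get(object_name))
```

Here `len_symbol` is 1 regardless of what the object contains, while `len_object` reflects the actual vector.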
lapply and do.call running very slowly?
Since I'm in an evangelizing mood ... here's what the fast `data.table` solution would look like:
library(data.table)
dt <- data.table(nuc, key="gene_id")
dt[, list(A = min(start),
          B = max(end),
          C = mean(pctAT),
          D = mean(pctGC),
          E = sum(length)), by = key(dt)]
# gene_id A B C D E
# 1: NM_032291 67000042 67108547 0.5582567 0.4417433 283
# 2: ZZZ 67000042 67108547 0.5582567 0.4417433 283