What's the Difference Between Lapply and Do.Call

What's the difference between lapply and do.call?

There is a function called Map that may be similar to map in other languages:

  • lapply returns a list of the same length as X, each element of which is the result of applying FUN to the corresponding element of X.

  • do.call constructs and executes a function call from a name or a function and a list of arguments to be passed to it.

  • Map applies a function to the corresponding elements of given vectors... Map is a simple wrapper to mapply which does not attempt to simplify the result, similar to Common Lisp's mapcar (with arguments being recycled, however). Future versions may allow some control of the result type.


  1. Map is a wrapper around mapply
  2. lapply is a special case of mapply
  3. Therefore Map and lapply will be similar in many cases.

For example, here is lapply:

lapply(iris, class)
$Sepal.Length
[1] "numeric"

$Sepal.Width
[1] "numeric"

$Petal.Length
[1] "numeric"

$Petal.Width
[1] "numeric"

$Species
[1] "factor"

And the same using Map:

Map(class, iris)
$Sepal.Length
[1] "numeric"

$Sepal.Width
[1] "numeric"

$Petal.Length
[1] "numeric"

$Petal.Width
[1] "numeric"

$Species
[1] "factor"
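
The two start to differ once there is more than one vector to walk over. The sketch below uses made-up inputs (`lens` and `labels` are hypothetical names, not from the question) to show Map iterating over two vectors in parallel, and the index-based workaround that lapply needs for the same job:

```r
# Hypothetical inputs, chosen only for illustration
lens   <- c(a = 1, b = 2, c = 3)
labels <- c("x", "y", "z")

# Map walks both vectors in parallel: f(lens[[1]], labels[[1]]), and so on
by_map <- Map(function(n, s) strrep(s, n), lens, labels)

# lapply iterates over a single vector, so index into both by position
by_lapply <- lapply(seq_along(lens), function(i) strrep(labels[[i]], lens[[i]]))

print(by_map)
print(by_lapply)
```

Both calls produce the same values; Map additionally carries over the names of its first vector argument.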

do.call takes a function and a list of arguments, and splats the list's elements into the call as individual arguments. It is widely used, for example, to assemble lists into simpler structures (often with rbind or cbind).

For example:

x <- lapply(iris, class)
do.call(c, x)
Sepal.Length  Sepal.Width Petal.Length  Petal.Width      Species
   "numeric"    "numeric"    "numeric"    "numeric"     "factor"

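As a rough illustration of the rbind use mentioned above, the following sketch builds a list of one-row data frames and stacks them with do.call. The summary columns `n` and `mean_len` are invented for the example; only the pattern is the point:

```r
# Hypothetical per-species summaries: one small data frame per list element
pieces <- lapply(split(iris$Sepal.Length, iris$Species), function(v) {
  data.frame(n = length(v), mean_len = mean(v))
})

# Equivalent to writing rbind(pieces[[1]], pieces[[2]], pieces[[3]]) by hand
combined <- do.call(rbind, pieces)
print(combined)
```

The advantage over spelling out the rbind call is that the list can have any length.
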
R: apply vs do.call


  • apply(DF, 1, f) converts each row of DF to a vector and then passes that vector to f. If DF is a mix of strings and numbers, each row is converted to a character vector before being passed to f, so that, for example, apply(iris, 1, function(x) sum(x[-5])) will not work even though the row iris[i, -5] contains only numeric elements: the row has already been converted to a character vector, and you can't sum character strings. On the other hand, apply(iris[-5], 1, sum) will work the same as rowSums(iris[-5]).

  • If f produces a vector, the result is a matrix and not another data frame; also, the result is the transpose of what you might expect. This

    apply(BOD, 1, identity)

    gives the following rather than giving BOD back:

           [,1] [,2] [,3] [,4] [,5] [,6]
    Time    1.0  2.0    3    4  5.0  7.0
    demand  8.3 10.3   19   16 15.6 19.8

    Many years ago Hadley Wickham posted iapply, which is idempotent in the sense that iapply(mat, 1, identity) returns mat, rather than t(mat), where mat is a matrix. More recently, with his plyr package, one can write:

    library(plyr)
    ddply(BOD, 1, identity)

    and get BOD back as a data frame.

On the other hand, apply(BOD, 1, sum) gives the same result as rowSums(BOD), and apply(BOD, 1, f) can be useful when f returns a scalar and has no dedicated counterpart the way sum has rowSums. Also, if f produces a vector and you don't mind a matrix result, you can transpose the output of apply yourself; it is ugly, but it works.
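
The coercion and transpose behaviour described above can be checked on a small example. The data frame `df` below is hypothetical and exists only for this sketch:

```r
# Hypothetical mixed-type data frame
df <- data.frame(a = c(1, 2), b = c(10, 20), label = c("u", "v"))

# apply() converts the data frame to a matrix first, so a single
# character column turns every row into a character vector
row_classes <- apply(df, 1, class)

# Dropping the non-numeric column first keeps each row numeric
sums <- apply(df[-3], 1, sum)

# A vector-valued function yields one column per input row,
# so t() is needed to get back the original orientation
m <- as.matrix(df[-3])
restored <- t(apply(m, 1, identity))

print(row_classes)
print(sums)
print(restored)
```

Here `sums` holds the same values as rowSums(df[-3]), and `restored` has the same shape and values as `m`.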

R package: using lapply to do.call a list of internal functions

Following the comments on the question, I have a (partial) answer:

  1. do.call by default evaluates character arguments in parent.frame(), and when called inside lapply, this is a different environment than the one the tmp functions are defined in (though I'm not sure about the specifics)
  2. Supplying the current environment to do.call makes the lapply approach work:
    lapply(funs, do.call, args = list(x = "test"), envir = environment())
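
A minimal sketch of point 2, assuming the functions are referred to by name (as character strings) and are defined only inside the calling function. The names `run_all`, `tmp1` and `tmp2` are placeholders, since the original package code is not shown here:

```r
run_all <- function() {
  # Placeholder helpers that exist only inside run_all()
  tmp1 <- function(x) paste("tmp1 got", x)
  tmp2 <- function(x) paste("tmp2 got", x)
  funs <- c("tmp1", "tmp2")

  # envir = environment() points do.call at run_all()'s own frame,
  # which is where the names "tmp1" and "tmp2" can be resolved
  lapply(funs, do.call, args = list(x = "test"), envir = environment())
}

res <- run_all()
print(res)
```

Passing the functions themselves instead of their names (for example `list(tmp1, tmp2)`) avoids the name lookup altogether, which may be simpler when the design allows it.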

What is the difference between call and apply?

The difference is that apply lets you invoke the function with arguments as an array; call requires the parameters be listed explicitly. A useful mnemonic is "A for array and C for comma."

See MDN's documentation on apply and call.

Pseudo syntax:

theFunction.apply(valueForThis, arrayOfArgs)

theFunction.call(valueForThis, arg1, arg2, ...)

As of ES6, you can also spread an array into the arguments of call using the spread syntax.

Sample code:

function theFunction(name, profession) {
    console.log("My name is " + name + " and I am a " + profession + ".");
}
theFunction("John", "fireman");
theFunction.apply(undefined, ["Susan", "school teacher"]);
theFunction.call(undefined, "Claude", "mathematician");
theFunction.call(undefined, ...["Matthew", "physicist"]); // used with the spread syntax

Difference between using `do.call()` on a function and calling the function directly in R?

Instead of as.name you should use get:

length(get(paste0("all_data_align_",year)))

You need to retrieve the object, not just its name.
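
To make the distinction concrete, here is a small sketch. The value of `year` and the contents of `all_data_align_2020` are invented for illustration:

```r
year <- 2020
all_data_align_2020 <- c(10, 20, 30)   # hypothetical object

nm <- paste0("all_data_align_", year)

# as.name() only builds a symbol; its length is the length of the symbol itself
len_of_name <- length(as.name(nm))

# get() looks the object up by name, so length() sees the actual vector
len_of_object <- length(get(nm))

print(len_of_name)     # 1
print(len_of_object)   # 3
```

The symbol and the object it refers to are different things, and a direct call to length never evaluates the symbol.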

lapply and do.call running very slowly?

Since I'm in an evangelizing mood ... here's what the fast data.table solution would look like:

library(data.table)
dt <- data.table(nuc, key="gene_id")

dt[, list(A = min(start),
          B = max(end),
          C = mean(pctAT),
          D = mean(pctGC),
          E = sum(length)), by = key(dt)]
#      gene_id        A        B         C         D   E
# 1: NM_032291 67000042 67108547 0.5582567 0.4417433 283
# 2:       ZZZ 67000042 67108547 0.5582567 0.4417433 283

