Understanding the Differences Between mclapply and parLapply in R

The beauty of mclapply is that the worker processes are all created as clones of the master right at the point that mclapply is called (via fork), so you don't have to worry about reproducing your environment on each of the cluster workers. Unfortunately, forking isn't possible on Windows.
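
For example, with mclapply a function defined only on the master is automatically visible inside the forked workers, with no exporting required (a minimal sketch; forking is available on Linux/macOS only):

library(parallel)
adder <- function(a, b) a + b                             # defined only on the master
mclapply(1:8, function(z) adder(z, 100), mc.cores = 4)    # workers inherit adder via fork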

When using parLapply, you generally have to perform the following additional steps:

  • Create a PSOCK cluster
  • Register the cluster if desired
  • Load necessary packages on the cluster workers
  • Export necessary data and functions to the global environment of the cluster workers

Also, when you're done, it's good practice to shut down the PSOCK cluster using stopCluster.

Here's a translation of your example to parLapply:

library(parallel)
cl <- makePSOCKcluster(4)                       # start four PSOCK workers
setDefaultCluster(cl)                           # register cl as the default cluster
adder <- function(a, b) a + b
clusterExport(NULL, c('adder'))                 # copy adder to each worker's global environment
parLapply(NULL, 1:8, function(z) adder(z, 100))
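
And when all of the parallel work is finished, shut the cluster down as mentioned above:

stopCluster(cl)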

If your adder function requires a package, you'll have to load that package on each of the workers before calling it with parLapply. You can do that quite easily with clusterEvalQ:

clusterEvalQ(NULL, library(MASS))

Note that the NULL first argument to clusterExport, clusterEvalQ and parLapply indicates that they should use the cluster object registered via setDefaultCluster. That can be very useful if your program uses mclapply in many different functions, so that you don't have to pass the cluster object to every function that needs it when converting your program to use parLapply.
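
For example, a helper function (here a hypothetical par_square) doesn't need a cluster argument at all, because NULL resolves to the registered default cluster:

par_square <- function(x) parLapply(NULL, x, function(z) z^2)
par_square(1:4)   # runs on the cluster registered via setDefaultCluster above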

Of course, adder may call other functions in your global environment, which call other functions, and so on. In that case, you'll have to export them as well and load any packages that they need. Also note that if any variables that you've exported change during the course of your program, you will have to export them again in order to update them on the cluster workers; see the sketch below. None of this is necessary with mclapply, because it forks fresh copies of the workers every time it is called.
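
Here's a minimal sketch of the re-export issue, using a hypothetical offset variable:

offset <- 100
clusterExport(NULL, 'offset')
parLapply(NULL, 1:4, function(z) z + offset)   # workers see offset = 100

offset <- 200                                  # value changed on the master...
clusterExport(NULL, 'offset')                  # ...so it must be exported again
parLapply(NULL, 1:4, function(z) z + offset)   # workers now see offset = 200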

mclapply vs parLapply speeds

Some quick benchmarks suggest that mclapply can be slightly faster, but this likely depends on the specific system and problem. The more balanced the jobs and the slower the individual tasks, the less it should matter which function you use.

library(parallel)
library(microbenchmark)

microbenchmark(
  parLapply = {
    cl <- makeCluster(2)
    parLapply(cl, rep(1:7, 3), function(x) {set.seed(1); rnorm(10^x)})
    stopCluster(cl)
  },
  mclapply = {mclapply(rep(1:7, 3), function(x) {set.seed(1); rnorm(10^x)}, mc.cores = 2)},
  times = 10
)

#Unit: seconds
#      expr     min      lq     mean   median       uq      max neval
# parLapply 1.85548 2.04397 3.332970 3.071284 4.323514 6.294364    10
#  mclapply 1.62610 1.65288 2.217407 1.849594 2.243418 5.435189    10

microbenchmark(
  parLapply = {
    cl <- makeCluster(2)
    parLapply(cl, rep(6, 20), function(x) {set.seed(1); rnorm(10^x)})
    stopCluster(cl)
  },
  mclapply = {mclapply(rep(6, 20), function(x) {set.seed(1); rnorm(10^x)}, mc.cores = 2)},
  times = 10
)

#Unit: milliseconds
#      expr      min        lq      mean   median       uq      max neval
# parLapply 1150.657 1188.9750 1705.1364 1242.739 2071.276 3785.516    10
#  mclapply  820.692  932.2262  994.4404 1000.402 1079.930 1117.863    10

sessionInfo()
#R version 3.3.1 (2016-06-21)
#Platform: x86_64-pc-linux-gnu (64-bit)
#Running under: Ubuntu 14.04.5 LTS
#
#locale:
# [1] LC_CTYPE=de_DE.UTF-8 LC_NUMERIC=C LC_TIME=de_DE.UTF-8 LC_COLLATE=de_DE.UTF-8
# [5] LC_MONETARY=de_DE.UTF-8 LC_MESSAGES=de_DE.UTF-8 LC_PAPER=de_DE.UTF-8 LC_NAME=C
# [9] LC_ADDRESS=C LC_TELEPHONE=C LC_MEASUREMENT=de_DE.UTF-8 LC_IDENTIFICATION=C
#
#attached base packages:
#[1] parallel stats graphics grDevices utils datasets methods base
#
#other attached packages:
#[1] microbenchmark_1.4-2.1 doParallel_1.0.10 iterators_1.0.8 foreach_1.4.3
#
#loaded via a namespace (and not attached):
# [1] colorspace_1.2-6 scales_0.4.0 plyr_1.8.4 tools_3.3.1 gtable_0.2.0 Rcpp_0.12.4
# [7] ggplot2_2.1.0 codetools_0.2-14 grid_3.3.1 munsell_0.4.3

Is mclapply() with mc.cores = 1 the same as lapply()?

The source code of parallel::mclapply contains this bit of code:

...
if (cores < 2L)
    return(lapply(X = X, FUN = FUN, ...))
...

So I believe the answer is yes: you should get the same results as calling lapply directly, with only a small amount of additional overhead that is unlikely to affect the runtime significantly.

The documentation also states that:

Details

mclapply is a parallelized version of lapply, provided mc.cores > 1:
for mc.cores == 1 it simply calls lapply.
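
As a quick sanity check (a minimal sketch; because of the fallback, the results should be identical):

library(parallel)
identical(lapply(1:3, sqrt),
          mclapply(1:3, sqrt, mc.cores = 1))   # TRUE: mclapply simply calls lapply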

Versions of lapply() and mclapply() that avoid redundant processing

This actually seems to work:

lightly_parallelize_atomic <- function(X, FUN, jobs = 1, ...){
  keys <- unique(X)                                             # each distinct value of X
  index <- match(X, keys)                                       # position of each element of X in keys
  values <- mclapply(X = keys, FUN = FUN, mc.cores = jobs, ...) # compute FUN once per unique value
  values[index]                                                 # expand back to the original length and order
}
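
For example, with a vector containing duplicates, FUN is evaluated only on the unique values:

X <- c(1, 2, 2, 3, 1)
lightly_parallelize_atomic(X, function(x) x^2, jobs = 2)
# returns list(1, 4, 4, 9, 1); x^2 is computed only for 1, 2 and 3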

And in my case, it's okay that X is atomic.

But it would be neat to find something already built into either a package or R natively.

R, the environment of mclapply and removing variables

You should call the gc function after removing the variable so that the memory associated with the object is freed by the garbage collector sooner rather than later. The rm function only removes the reference to the data, while the actual object may continue to exist until the garbage collector eventually runs.

You may also want to call gc before the first mclapply to make testing easier:

gc()
opt.Models = mclapply(1:100, mc.cores = 20, function(i) {
  res = loadResult(reg, id = i)
  return(post.Process(res))
})

# presumably do something with opt.Models...

rm(opt.Models)
gc() # free up memory before forking

opt.Models = mclapply(1:100, mc.cores = 20, function(i) {
  res = loadResult(reg, id = i)
  return(post.Process(res))
})

