How to show the progress of code in parallel computation in R?
The doSNOW package has support for progress bars, while doParallel does not. Here's a way to put a progress bar in your example:
require(doSNOW)
require(Kendall)
cores <- parallel::detectCores()
cl <- makeSOCKcluster(cores)
registerDoSNOW(cl)
mydata <- matrix(rnorm(8000*500), ncol=500)
pb <- txtProgressBar(min=1, max=8000, style=3)
progress <- function(n) setTxtProgressBar(pb, n)
opts <- list(progress=progress)
result <-
foreach(i=1:8000, .packages="Kendall", .options.snow=opts,
.combine='rbind') %dopar% {
abc <- MannKendall(mydata[i,])
data.frame(tau=abc$tau, sl=abc$sl)
}
close(pb)
stopCluster(cl)
Using standard R shiny progress bar in parallel foreach calculations
The doParallel package is an extension of the parallel package as shown in the documentation here.
https://cran.r-project.org/web/packages/doParallel/doParallel.pdf
Reading the parallel package's documentation we see that it implements 3 different methods to achieve parallelism. Keep in mind R is a single threaded language.
- A new R session where the parent process communicates with a worker or child process.
- Via Forking
- Using OS level facilities
You can find this information here,
https://stat.ethz.ch/R-manual/R-devel/library/parallel/doc/parallel.pdf
A consequence of this is that the child process cannot communicate with the parent process until it completes its computation and returns a value. This is to the best of my knowledge.
Hence, ticking the progress bar within the worker process will not be possible.
Full disclosure, I have not worked with the doParallel package and the documentation with respect to shiny was limited.
Alternative solution
There is a similar package however with extensive documentation with respect to shiny. These are the futures
and promises
and ipc
packages. futures
and promises
enable asynchronous programming while ipc
enables interprocess communication. To help us even more it also has an AsyncProgress()
function.
Here is an example where we tick two counters synchronously.
Example
library(shiny)
library(future)
library(promises)
library(ipc)
plan(multisession)
ui <- fluidPage(
actionButton(inputId = "go", label = "Launch calculation")
)
server <- function(input, output, session) {
observeEvent(input$go, {
progress = AsyncProgress$new(message="Complex analysis")
future({
for (i in 1:15) {
progress$inc(1/15)
Sys.sleep(0.5)
}
progress$close()
return(i)
})%...>%
cat(.,"\n")
Sys.sleep(1)
progress2 = AsyncProgress$new(message="Complex analysis")
future({
for (i in 1:5) {
progress2$inc(1/5)
Sys.sleep(0.5)
}
progress2$close()
return(i)
})%...>%
cat(.,"\n")
NULL
})
}
shinyApp(ui = ui, server = server)
Your code adapted
Here is the code you have written, slightly modified to spin off many asynchronous processes. Any work can be performed in the worker, such as the vector you create and add an rnorm
too. (Not shown here)
library(shiny)
library(future)
library(promises)
library(ipc)
plan(multisession)
ui <- fluidPage(
actionButton(inputId = "go", label = "Launch calculation")
)
server <- function(input, output, session) {
observeEvent(input$go, {
Runs=c(1:4) #define the number of runs
progress = list() #A list to maintain progress for each run
for(j in Runs){
progress[[j]] = AsyncProgress$new(message="Complex analysis")
future({
for (i in 1:10) {
progress[[j]]$inc(1/10)
Sys.sleep(0.2)
}
progress[[j]]$close()
return(i)
})%...>%
cat(.,'\n')
}
NULL
})
}
shinyApp(ui = ui, server = server)
The code above is a modified version of the code found in the ipc documentation here:
http://htmlpreview.github.io/?https://github.com/fellstat/ipc/blob/master/inst/doc/shinymp.html
Additional Resources:
https://rstudio.github.io/promises/articles/overview.html
R: asynchronous parallel lapply
(Disclaimer: I'm the author of the future framework and the progressr package)
A close solution that resembles base::lapply()
, and your pbapply::pblapply()
example, is to use the future.apply as:
library(future.apply)
## The below is same as plan(multisession, workers=4)
cl <- parallel::makeCluster(4)
plan(cluster, workers=cl)
xs <- 1:100
results <- future_lapply(xs, FUN=function(x) {
Sys.sleep(0.1)
sqrt(x)
})
Chunking:
You can control the amount of chunking with argument future.chunk.size
or supplementary future.schedule
. To disable chunking such that each element is processed in a unique parallel task, use future.chunk.size=1
. This way, if there is one element that takes much longer than other elements, it will not hold up any other elements.
xs <- 1:100
results <- future_lapply(xs, FUN=function(x) {
Sys.sleep(0.1)
sqrt(x)
}, future.chunk.size=1)
Progress updates in parallel:
If you want to receive progress updates when doing parallel processing, you can use progressr package and configure it to use the progress package to report updates as a progress bar (here also with an ETA).
library(future.apply)
plan(multisession, workers=4)
library(progressr)
handlers(handler_progress(format="[:bar] :percent :eta :message"))
with_progress({
p <- progressor(along=xs)
results <- future_lapply(xs, FUN=function(x) {
p() ## signal progress
Sys.sleep(0.1)
sqrt(x)
}, future.chunk.size=1)
})
You can wrap this into a function, e.g.
my_fcn <- function(xs) {
p <- progressor(along=xs)
future_lapply(xs, FUN=function(x) {
p()
Sys.sleep(0.1)
sqrt(x)
}, future.chunk.size=1)
}
This way you can call it as a regular function:
> result <- my_fcn(xs)
and use plan()
to control exactly how you want it to parallelize. This will not report on progress. To do that, you'll have to do:
> with_progress(result <- my_fcn(xs))
[====>-----------------------------------------------------] 9% 1m
Run everything in the background: If your question was how to run the whole shebang in the background, see the 'Future Topologies' vignette. That's another level of parallelization but it's possible.
how to track progress in mclapply in R in parallel package
This is an update of my related answer.
library(parallel)
finalResult <- local({
f <- fifo(tempfile(), open="w+b", blocking=T)
if (inherits(parallel:::mcfork(), "masterProcess")) {
# Child
progress <- 0.0
while (progress < 1 && !isIncomplete(f)) {
msg <- readBin(f, "double")
progress <- progress + as.numeric(msg)
cat(sprintf("Progress: %.2f%%\n", progress * 100))
}
parallel:::mcexit()
}
numJobs <- 100
result <- mclapply(1:numJobs, function(...) {
# Do something fancy here... For this example, just sleep
Sys.sleep(0.05)
# Send progress update
writeBin(1/numJobs, f)
# Some arbitrary result
sample(1000, 1)
})
close(f)
result
})
cat("Done\n")
parSapply and progress bar
The parSapply
function doesn't support a progress bar, and I don't think there is any really good way to implement one by adding extra code to the task function, although people have made valiant efforts to do that.
The doSNOW
package supports progress bars, so you could either use that directly or write a wrapper function that works like the parSapply
function. Here's one way to write such a wrapper function:
# This function is similar to "parSapply", but doesn't preschedule
# tasks and doesn't support "simplify" and "USE.NAMES" options
pbSapply <- function(cl, X, FUN, ...) {
registerDoSNOW(cl)
pb <- txtProgressBar(max=length(X))
on.exit(close(pb))
progress <- function(n) setTxtProgressBar(pb, n)
opts <- list(progress=progress)
foreach(i=X, .combine='c', .options.snow=opts) %dopar% {
FUN(i, ...)
}
}
You can easily modify this function to use either the tkProgressBar
or winProgressBar
function.
Here's an example use of pbSapply
:
library(doSNOW)
cl <- makeSOCKcluster(3)
x <- pbSapply(cl, 1:100, function(i, j) {Sys.sleep(1); i + j}, 100)
Note that this doesn't use prescheduling, so the performance won't be as good as parSapply
if you have small tasks.
Related Topics
How to Plot the Relative Proportions of Two Groups Using a Fill Aesthetic in Ggplot2
R Xts: Generating 1 Minute Time Series from Second Events
How to Get the Zoom Level from the Leaflet Map in R/Shiny
Generate Observers for Dynamic Number of Inputs
Removing Rows in R Based on Values in a Single Column
How to Rotate an Image R Raster
Weighted Pearson's Correlation
R-Shiny Using Reactive Renderui Value
About Gforce in Data.Table 1.9.2
Dynamic Arguments to Expand.Grid
Add a Page Refresh Button by Using R Shiny
What's My User Agent When I Parse Website with Rvest Package in R
How to Have Conditional Formatting of Data Frames in R Shiny
Getting Frequency Values from Histogram in R
How to Set Unique Row and Column Names of a Matrix When Its Dimension Is Unknown