Rmpi: cannot use MPI_Comm_spawn API

You need to use MPICH2 for spawn support. If you have MPICH2 installed, you may still need to specify --with-Rmpi-type=MPICH2 when installing Rmpi: specifying --with-Rmpi-type=MPICH instead disables functions such as mpi.spawn.Rslaves.

Also note that MPICH2 apparently does not support spawning workers unless the program is launched with a command such as mpiexec. In practice this means you can't execute mpi.spawn.Rslaves from an interactive R session under MPICH2, although this is possible with Open MPI. To be clear, this is not the issue you're reporting, but you may encounter it after you have correctly installed Rmpi against MPICH2.
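
For example, with Open MPI something like this typically works from an interactive R session (a minimal sketch; the worker count of 2 is arbitrary):

library(Rmpi)
mpi.spawn.Rslaves(nslaves = 2)    # needs Rmpi built with spawn support
mpi.remote.exec(mpi.comm.rank())  # each spawned worker reports its rank
mpi.close.Rslaves()               # shut the workers down
mpi.quit()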

I was able to install Rmpi 0.6-5 using MPICH 3.1.3 with the command:

$ R CMD INSTALL Rmpi_0.6-5.tar.gz --configure-args="--with-mpi=$HOME/mpich-install --with-Rmpi-type=MPICH2"

To debug a configuration problem, install Rmpi from an unpacked directory rather than from the tar file. That lets you examine the generated "config.log" file afterwards, which contains important information. Here is how I did that on my Linux box:

$ tar xzvf Rmpi_0.6-5.tar.gz 
$ R CMD INSTALL Rmpi --configure-args="--with-mpi=$HOME/mpich-install --with-Rmpi-type=MPICH2"

In order to get spawn support, the MPI2 macro needs to be defined when compiling the C code in Rmpi. You can check if that is happening by searching for "PKG_CPPFLAGS" in config.log:

$ grep PKG_CPPFLAGS Rmpi/config.log
PKG_CPPFLAGS='-I/home/steve/mpich-install/include -DMPI2 -DMPICH2'

I have found "config.log" to be very useful for debugging configuration and build problems.

Note that you can use Rmpi without spawn support. You'll need to start all of the workers using mpirun (or mpiexec, etc.), and it will be much more difficult, if not impossible, to use functions such as mpi.apply and mpi.applyLB. But if you just need to initialize MPI so you can call MPI from functions implemented in C or Fortran, you would probably want to start all of the workers via mpirun anyway.
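
For example, a minimal non-spawning Rmpi script looks like this (a sketch; in Rmpi, comm 0 refers to MPI_COMM_WORLD):

# hello.R: run with something like "mpirun -n 4 R --slave -f hello.R"
library(Rmpi)
cat(sprintf("rank %d of %d\n", mpi.comm.rank(0), mpi.comm.size(0)))
mpi.quit()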

R - Error in Rmpi with snow

When calling makeCluster to create an MPI cluster, the spec argument should either be a number or missing, depending on whether or not you want the workers to be spawned. You can't specify the hostnames as you would when creating a SOCK cluster. To start workers on other machines with an MPI cluster, you have to execute your R script using a command such as mpirun or mpiexec (depending on your MPI installation), and you specify the hosts to use via arguments to mpirun, not to makeCluster.

In your case, you might execute your script with:

$ mpirun -n 1 -H ip3,localhost,ip1,ip2 R --slave -f script.R

Since -n 1 is used, your script executes only on "ip3", not on all four hosts, but MPI knows about the other three hosts and will be able to spawn processes on them.

You would create the MPI cluster in that script with:

library(snow)
cl <- makeCluster(3, type = "MPI")

This should cause one worker to be spawned on each of "localhost", "ip1", and "ip2", with the master process running on "ip3" (at least with Open MPI; I'm not sure about other MPI distributions). I don't believe the "master" option is used with the MPI transport: it's primarily used by the SOCK transport.
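
Once the cluster is up, you can use the usual snow functions on it. A short illustrative sketch:

# Ask each worker for its hostname (a quick check that spawning worked)
unlist(clusterCall(cl, function() Sys.info()[["nodename"]]))

# Shut the workers down when finished
stopCluster(cl)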

You can get lots of information about mpirun from its man page.

Error running Rmpi when doing parallel computing

The problem is that more than one loaded package exports a makeCluster function, so the call may dispatch to the wrong one. Qualifying it as parallel::makeCluster(nCores) resolves the ambiguity.
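
For example, a minimal sketch (detectCores and the toy parLapply call are just illustrations):

library(parallel)

nCores <- parallel::detectCores() - 1   # leave one core free (illustrative)
cl <- parallel::makeCluster(nCores)     # unambiguously parallel's makeCluster
parLapply(cl, 1:4, function(i) i^2)     # toy use of the cluster
stopCluster(cl)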

Number of slaves is 0 when I mpirun my R code that tests Rmpi

Install the pbdMPI package in an R session on the login node and run the following translation of the Rmpi test code to pbdMPI:

library(pbdMPI)
init()

# Tell all R sessions to return a message identifying themselves
id <- comm.rank()
ns <- comm.size()
host <- system("hostname", intern = TRUE)
comm.cat("I am", id, "on", host, "of", ns, "\n", all.rank = TRUE)

# Test computations
x <- 5
x <- rnorm(x)
comm.print(length(x))
comm.print(x, all.rank = TRUE)

finalize()

You run it the same way you ran the Rmpi version:

$ mpirun -np 4 Rscript your_new_script_file

Spawning MPI processes (as in the Rmpi example) was appropriate when running on clusters of workstations, but on an HPC cluster the prevalent way to program with MPI is SPMD: single program, multiple data. SPMD means that your code is a generalization of a serial code that is able to have several copies of itself cooperate with each other.

In the above example, cooperation happens only in the printing (the comm.* functions). There is no manager/master, just several R sessions running the same code (usually computing something different based on comm.rank()) and cooperating/communicating via MPI. This is the prevalent way of doing large-scale parallel computing on HPC clusters.
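
For example, here is a sketch of SPMD cooperation beyond printing, where each rank computes on its own data and the results are combined with an MPI reduction (the rank-based seeding is just an illustration):

library(pbdMPI)
init()

id <- comm.rank()
ns <- comm.size()

# Each rank draws its own sample, seeded by rank so the streams differ
set.seed(1000 + id)
my.mean <- mean(rnorm(1e6))

# Combine the per-rank results with an MPI reduction
global.mean <- allreduce(my.mean, op = "sum") / ns
comm.print(global.mean)

finalize()

You would launch it with mpirun, exactly like the script above.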

Call parallel fortran MPI subroutine from R

Here is a simple Fortran/MPI subroutine that I want to call from R:

subroutine test(id, ierr)
  use mpi
  implicit none
  integer*4 id, ierr
  call MPI_Comm_rank(MPI_COMM_WORLD, id, ierr)
end subroutine test

To call this from R on a Linux machine, I built a shared object file using the Open MPI wrapper command "mpif90":

$ mpif90 -fpic -shared -o test.so test.f90

I tried to use "R CMD SHLIB", but eventually decided that it was easier to get "mpif90" to create a shared object than to get "R CMD SHLIB" to deal with MPI. The downside is that the command above is gfortran-specific. For a different compiler, you might get some help from the SHLIB "--dry-run" option:

$ R CMD SHLIB --dry-run test.f90

This will display the commands that it would have used to create the shared object using your compiler. You can then modify the commands to use "mpif90" in order to handle the MPI headers and libraries.

Here is an R script that calls the Fortran test subroutine. It loads Rmpi (which automatically calls MPI_Init), loads the shared object containing my Fortran subroutine, and then calls it:

# SPMD-style program: start all workers via mpirun
library(Rmpi)
dyn.load("test.so")

# This Fortran subroutine will use MPI functions
r <- .Fortran("test", as.integer(0), as.integer(0))

# Each worker displays the results
id <- r[[1]]
ierr <- r[[2]]
if (ierr == 0) {
  cat(sprintf("worker %d: hello\n", id))
} else {
  cat(sprintf("ierr = %d\n", ierr))
}

# Finalize MPI and quit
mpi.quit()

Since it's an SPMD-style program, it doesn't spawn workers the way many Rmpi examples do. Instead, all of the workers are started via mpirun, which is the typical way of executing C and Fortran MPI programs:

$ mpirun -n 3 R --slave -f test.R

This runs three instances of my R script, so the output (in no guaranteed order) is:

worker 0: hello
worker 1: hello
worker 2: hello

I think that structuring the code in this way makes it easy to use MPI from R and from any number of Fortran subroutines.


