Rmpi: cannot use MPI_Comm_spawn API
You need to use MPICH2 for spawn support. If you have MPICH2 installed, you may still need to specify --with-Rmpi-type=MPICH2 when installing Rmpi. If you used --with-Rmpi-type=MPICH instead, that would disable functions such as mpi.spawn.Rslaves.
Also note that MPICH2 apparently does not support spawning workers unless the program is launched using a command such as mpiexec. This basically means that you can't execute mpi.spawn.Rslaves from an interactive R session using MPICH2, although that is possible with Open MPI. To be clear, this is not the issue that you're reporting, but you may encounter it after you have correctly installed Rmpi using MPICH2.
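For reference, an interactive spawn with Open MPI looks roughly like this sketch (the worker count of 2 is arbitrary):

```r
# Sketch: spawning workers from an interactive session (works with
# Open MPI; MPICH2 would require launching R via mpiexec first).
library(Rmpi)
mpi.spawn.Rslaves(nslaves = 2)    # start two R worker processes
mpi.remote.exec(mpi.comm.rank())  # ask each worker for its MPI rank
mpi.close.Rslaves()               # shut the workers down
mpi.quit()                        # finalize MPI and exit R
```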
I was able to install Rmpi 0.6-5 using MPICH 3.1.3 with the command:
$ R CMD INSTALL Rmpi_0.6-5.tar.gz --configure-args='--with-mpi=$HOME/mpich-install --with-Rmpi-type=MPICH2'
To debug a configuration problem, you should install Rmpi from a directory rather than a tar file. That will allow you to examine the "config.log" file afterwards which will provide important information. Here is how I did that on my Linux box:
$ tar xzvf Rmpi_0.6-5.tar.gz
$ R CMD INSTALL Rmpi --configure-args='--with-mpi=$HOME/mpich-install --with-Rmpi-type=MPICH2'
In order to get spawn support, the MPI2 macro needs to be defined when compiling the C code in Rmpi. You can check whether that is happening by searching for "PKG_CPPFLAGS" in config.log:
$ grep PKG_CPPFLAGS Rmpi/config.log
PKG_CPPFLAGS='-I/home/steve/mpich-install/include -DMPI2 -DMPICH2'
I have found "config.log" to be very useful for debugging configuration and build problems.
Note that you can use Rmpi without spawn support. You'll need to start all of the workers using mpirun (or mpiexec, etc.), and it will be much more difficult, if not impossible, to use functions such as mpi.apply, mpi.applyLB, etc. But if you just need to initialize MPI so you can use MPI from functions implemented in C or Fortran, you will probably want to start all of the workers via mpirun anyway.
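A minimal sketch of that spawn-free, SPMD style, assuming a hypothetical script named nospawn.R launched entirely by mpirun:

```r
# Sketch: Rmpi without spawn support; every rank runs this same script.
# Launch with e.g.:  mpirun -n 4 R --slave -f nospawn.R
library(Rmpi)
# Each process reports its own rank; no workers are spawned from R.
cat(sprintf("rank %d of %d\n", mpi.comm.rank(), mpi.comm.size()))
mpi.quit()  # finalize MPI and exit
```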
R - Error in Rmpi with snow
When calling makeCluster to create an MPI cluster, the spec argument should either be a number or missing, depending on whether you want the workers to be spawned or not. You can't specify the hostnames, as you would when creating a SOCK cluster. And in order to start workers on other machines with an MPI cluster, you have to execute your R script using a command such as mpirun, mpiexec, etc., depending on your MPI installation, and you specify the hosts to use via arguments to mpirun, not to makeCluster.
In your case, you might execute your script with:
$ mpirun -n 1 -H ip3,localhost,ip1,ip2 R --slave -f script.R
Since -n 1 is used, your script executes only on "ip3", not on all four hosts, but MPI knows about the other three hosts and will be able to spawn processes to them.
You would create the MPI cluster in that script with:
cl <- makeCluster(3)
This should cause a worker to be spawned on "localhost", "ip1", and "ip2", with the master process running on "ip3" (at least with Open MPI: I'm not sure about other MPI distributions). I don't believe the "master" option is used with the MPI transport: it's primarily used by the SOCK transport.
You can get lots of information about mpirun from its man page.
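Putting those pieces together, script.R might look like the following sketch (assuming the snow package; type = "MPI" makes the transport explicit):

```r
# Sketch: script.R, launched as
#   mpirun -n 1 -H ip3,localhost,ip1,ip2 R --slave -f script.R
library(snow)
cl <- makeCluster(3, type = "MPI")  # spawn 3 workers on the other hosts
# Ask each worker which machine it landed on
print(clusterCall(cl, function() Sys.info()[["nodename"]]))
stopCluster(cl)
mpi.quit()  # Rmpi is loaded by snow's MPI support; finalize cleanly
```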
Error running Rmpi when doing parallel computing
The problem is that makeCluster is defined by more than one attached package, so the wrong version can be picked up. As such, I use the namespace-qualified call parallel::makeCluster(nCores) to solve the issue.
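A minimal sketch of the namespace-qualified call, using a hypothetical two-worker local cluster:

```r
# Sketch: qualify with parallel:: so a same-named function from another
# attached package (e.g. snow) cannot shadow it.
cl <- parallel::makeCluster(2)                       # two local workers
res <- parallel::parSapply(cl, 1:4, function(x) x^2) # square in parallel
parallel::stopCluster(cl)
print(res)  # 1 4 9 16
```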
Number of slaves 0 when I mpirun my R code that tests Rmpi
Install the package pbdMPI in an R session on the login node and run the following translation of the Rmpi test code into the use of pbdMPI:
library(pbdMPI)
init()
# Tell all R sessions to return a message identifying themselves
id <- comm.rank()
ns <- comm.size()
host <- system("hostname", intern = TRUE)
comm.cat("I am", id, "on", host, "of", ns, "\n", all.rank = TRUE)
# Test computations
x <- 5
x <- rnorm(x)
comm.print(length(x))
comm.print(x, all.rank = TRUE)
finalize()
You run it the same way you used for the Rmpi version: mpirun -np 4 Rscript your_new_script_file.
Spawning MPI (as in the Rmpi example) was appropriate when running on clusters of workstations, but on an HPC cluster the prevalent way to program with MPI is SPMD - single program multiple data. SPMD means that your code is a generalization of a serial code that is able to have several copies of itself cooperate with each other.
In the above example, cooperation happens only with printing (the comm.* functions). There is no manager/master, just several R sessions running the same code (usually computing something different based on comm.rank()) and cooperating/communicating via MPI. This is the prevalent way of large-scale parallel computing on HPC clusters.
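As a sketch of that rank-based style (a hypothetical example, not from the original test code), each rank can compute a different slice of a problem and combine results with allreduce:

```r
# Sketch: SPMD work division by rank; run with: mpirun -np 4 Rscript slice.R
library(pbdMPI)
init()
id <- comm.rank()
ns <- comm.size()
# Each rank sums a different, non-overlapping slice of 1:100
my.idx <- seq(id + 1, 100, by = ns)
local.sum <- sum(my.idx)
total <- allreduce(local.sum, op = "sum")  # combine partial sums across ranks
comm.print(total)                          # 5050, regardless of rank count
finalize()
```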
Call parallel fortran MPI subroutine from R
Here is a simple Fortran/MPI subroutine that I want to call from R:
subroutine test(id, ierr)
use mpi
implicit none
integer*4 id, ierr
call MPI_Comm_rank(MPI_COMM_WORLD, id, ierr)
end subroutine test
To call this from R on a Linux machine, I built a shared object file using the Open MPI wrapper command "mpif90":
$ mpif90 -fpic -shared -o test.so test.f90
I tried to use "R CMD SHLIB", but eventually decided that it was easier to get "mpif90" to create a shared object than to get "R CMD SHLIB" to deal with MPI. The downside is that the command is gfortran specific. For a different compiler, you might get some help by using the "SHLIB" --dry-run
option:
$ R CMD SHLIB --dry-run test.f90
This will display the commands that it would have used to create the shared object using your compiler. You can then modify the commands to use "mpif90" in order to handle the MPI headers and libraries.
Here is an R script that calls the Fortran test subroutine. It loads Rmpi (which automatically calls MPI_Init), loads the shared object containing my Fortran subroutine, and then calls it:
# SPMD-style program: start all workers via mpirun
library(Rmpi)
dyn.load("test.so")
# This Fortran subroutine will use MPI functions
r <- .Fortran("test", as.integer(0), as.integer(0))
# Each worker displays the results
id <- r[[1]]
ierr <- r[[2]]
if (ierr == 0) {
  cat(sprintf("worker %d: hello\n", id))
} else {
  cat(sprintf("ierr = %d\n", ierr))
}
# Finalize MPI and quit
mpi.quit()
Since it's an SPMD-style program, it doesn't spawn workers the way many Rmpi examples do. Instead, all of the workers are started via mpirun, which is the typical way of executing C and Fortran MPI programs:
$ mpirun -n 3 R --slave -f test.R
This runs three instances of my R script, so the output is:
worker 0: hello
worker 1: hello
worker 2: hello
I think that structuring the code in this way makes it easy to use MPI from R and from any number of Fortran subroutines.