<div class="gmail_quote">I'm sorry if this is the wrong channel of communication for these types of problems. If that is the case, I would appreciate knowing where to go.<br><br>I am aware that Rmpi was mostly developed under LAM-MPI, but I am attempting to deploy it under MPICH2.<br>
MPICH2 has been set up using the "./configure --with-device=ch3:sock" command in order to avoid a bug I was encountering with some of the nodes. Everything else under MPICH2 now works, and I can compile and run the examples without problem. MPICH2 is deployed across the cluster under the /mirror/mpich2 directory. If it's relevant, the nodes also have the home directory for the mpiu user mirrored. However, I am running into problems with Rmpi.<br>
<br>To install Rmpi, I used my generic mpiu account and executed the following command:<br>> install.packages("Rmpi", configure.args="--with-mpi=/mirror/mpich2")<br>
<br>This installation completes without error, and I am able to load the Rmpi library with the "> library(Rmpi)" command from the R prompt.<br><br>This is where my problems occur, and where I could use your advice.<br>
<br>If I start the mpd daemon with 1 node using the following command:<br>$ mpdboot -n 1 -v<br>then I can successfully use the<br>> mpi.spawn.Rslaves()<br>command to start the Rslaves, with the following output:<br clear="all">
1 slaves are spawned successfully. 0 failed.<br>
master (rank 0, comm 1) of size 2 is running on: hal <br>
slave1 (rank 1, comm 1) of size 2 is running on: hal<br>
> mpi.remote.exec(paste("I am",mpi.comm.rank(),"of",mpi.comm.size()))<br>$slave1<br>[1] "I am 1 of 2"<br>> mpmpi.close.Rslaves()<br>mpi.close.Rslaves()<br>[1] 1<br>> mpi.quit()<br>> Error: unexpected '>' in ">"<br>
> mpi.quit()<br>mpi.quit() <br>mpi.quit()<br><br>There seems to be some error (possibly permissions?) and after getting back to the $ prompt, I get a lot of errors in the following form:<br>mpiexec_hal (handle_stdin_input 1089): stdin problem; if pgm is run in background, redirect from /dev/null<br>
mpiexec_hal (handle_stdin_input 1090): e.g.: mpiexec -n 4 a.out < /dev/null<br><br><br>After doing this, I can <br>However, if I start the mpd daemon with 2 (or more) nodes and then run the following commands from the R prompt:<br>
> library("Rmpi")<br>> mpi.spawn.Rslaves()<br>I immediately get the following error:<br><br>
Error in mpi.comm.spawn(slave = system.file("Rslaves.sh", package = "Rmpi"), : <br> Other MPI error, error stack:<br>MPI_Comm_spawn(144)...........: MPI_Comm_spawn(cmd="/home/mpiu/R/i486-pc-linux-gnu-library/2.6/Rmpi/Rslaves.sh", argv=0x8b8ce20, maxprocs=1, MPI_INFO_NULL, root=0, MPI_COMM_SELF, intercomm=0x88cd0e0, errors=0x80ff870) failed<br>
MPIDI_Comm_spawn_multiple(233): PMI_Spawn_multiple failed<br><br>For this particular error, the output of "mpdtrace -l" is:<br>hal_43272 (192.168.100.1)<br>n01_55355 (192.168.100.101)<br><br>Where hal is the name of the master node with mpd listening on port 43272, and n01 is the slave node listening on port 55355.<br>
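<br>In case it helps anyone compare rings, here is a minimal sketch that splits the "mpdtrace -l" output above into host, port, and address fields. The function name parse_mpdtrace is hypothetical, and the sketch assumes every line has the "host_port (address)" form shown above with no trailing whitespace.<br>

```shell
#!/bin/sh
# Hypothetical helper (not part of MPICH2 or Rmpi): read "mpdtrace -l"
# output on stdin and print "host port address" for each line that
# matches the assumed "host_port (address)" form. Other lines are dropped.
parse_mpdtrace() {
    sed -n 's/^\(.*\)_\([0-9][0-9]*\) (\(.*\))$/\1 \2 \3/p'
}

# Sample input copied from the two-node ring described above.
sample='hal_43272 (192.168.100.1)
n01_55355 (192.168.100.101)'

# One line per node: host, mpd port, IP address.
printf '%s\n' "$sample" | parse_mpdtrace
```

On a live ring the same function could be fed directly, e.g. "mpdtrace -l | parse_mpdtrace".<br>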
<br>I have tried two different versions of Rmpi (0.5-7 and 0.5-8), but get the same error regardless.<br><br>The error seems to originate in the mpi_comm_spawn(...) call in the ./src/Rmpi.c file of the Rmpi package.<br>
<br>I am completely baffled by this, and any help (or a good mailing list from which to ask for help) would be very much appreciated.<br><br>Thank you for your time,<br>Cye Stoner<br><font color="#888888"><br><br>-- <br>"If you already know what recursion is, just remember the answer. Otherwise, find someone who is standing closer to<br>
Douglas Hofstadter than you are; then ask him or her what recursion is." - Andrew Plotkin<br>
</font></div><br>