[MPICH] MPI_Comm_spawn, -usize and -machinefile
Martin Siegert
siegert at sfu.ca
Thu Jan 5 20:39:34 CST 2006
Hi,
I am trying to figure out how to use MPI_Comm_spawn. In particular,
I want the slave processes spawned on nodes specified in the
-machinefile argument to mpiexec, e.g.,
    mpiexec -machinefile mpihosts -usize 4 -n 1 ./master_prog ./slave_prog

master_prog then calls

    MPI_Comm_spawn(argv[1], slave_argv, universe_size-1,
                   MPI_INFO_NULL, 0, MPI_COMM_SELF, &everyone,
                   MPI_ERRCODES_IGNORE);

and I expected that those slave processes would run on the remaining
hosts specified in the "mpihosts" file (there are 4 hosts in that file).
That is not what happens: instead, the slaves are spawned on the
first 3 hosts listed by mpdtrace. Is there any way to have those slaves
started on the nodes specified in the mpihosts file?
Or is the only way to achieve this to start my own mpd ring, as in the
following?

    export MPD_USE_ROOT_MPD=0
    mpdboot -n 4 -f mpihosts
    mpiexec -usize 4 -n 1 ./master_prog ./slave_prog
    mpdallexit

(This is with mpich2-1.0.3. I usually use the mpds started by root
at boot time on each node, i.e., every user has the environment
variable MPD_USE_ROOT_MPD set to 1 by default.)
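One alternative I can think of, but have not verified with mpich2-1.0.3,
is to pass the "host" info key that the MPI-2 standard reserves for
MPI_Comm_spawn instead of MPI_INFO_NULL. The sketch below covers only the
first half of that idea: picking the k-th host name out of the machinefile.
The helper names host_from_line and nth_host are made up for illustration,
and the line format ("hostname" or "hostname:ncpus", with '#' starting a
comment) is my assumption about the file.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Copy the host name from one machinefile line into out (outlen >= 1).
 * Assumed line format: "hostname" or "hostname:ncpus"; '#' starts a comment.
 * Returns 1 if a host name was found, 0 for blank or comment-only lines. */
int host_from_line(const char *line, char *out, size_t outlen)
{
    size_t n = 0;

    /* Skip leading whitespace. */
    while (isspace((unsigned char)*line))
        line++;

    /* Copy up to the first ':', '#', whitespace, or end of string. */
    while (*line != '\0' && *line != '#' && *line != ':' &&
           !isspace((unsigned char)*line) && n + 1 < outlen)
        out[n++] = *line++;

    out[n] = '\0';
    return n > 0;
}

/* Find the k-th host (0-based) in an already opened machinefile.
 * Returns 0 on success, -1 if the file lists fewer than k+1 hosts. */
int nth_host(FILE *fp, int k, char *out, size_t outlen)
{
    char line[256];
    int seen = 0;

    rewind(fp);
    while (fgets(line, sizeof line, fp) != NULL) {
        if (!host_from_line(line, out, outlen))
            continue;           /* blank line or comment */
        if (seen++ == k)
            return 0;           /* out now holds the k-th host name */
    }
    out[0] = '\0';
    return -1;
}
```

master_prog could then call nth_host(fp, k, name, sizeof name), followed by
MPI_Info_create(&info) and MPI_Info_set(info, "host", name), and pass info to
MPI_Comm_spawn in place of MPI_INFO_NULL. Whether the mpd process manager
honors that key is exactly what I am unsure about.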
Thanks in advance for your advice!
Cheers,
Martin
--
Martin Siegert
Head, HPC at SFU
WestGrid Site Manager
Academic Computing Services phone: (604) 291-4691
Simon Fraser University fax: (604) 291-4242
Burnaby, British Columbia email: siegert at sfu.ca
Canada V5A 1S6