[mpich-discuss] Specifying hosts
Scott Atchley
atchley at myri.com
Tue May 5 15:45:09 CDT 2009
Hi all,
I have run into a behavior that I did not expect. I have been using
mpdboot to launch mpds on hosts with eight cores. I specify a
machinefile with lines such as:
node1:8
node2:8
I then call mpdboot with:
$ mpdboot -n <num_hosts> -f machines -m `which mpd`
This works and mpdtrace shows all the hosts.
I then launch a job with:
$ mpiexec -n <num_cores> ...
expecting 8 ranks per machine _and_ that the ranks are allocated
sequentially by host. That is, ranks 0-7 on the first host, 8-15 on the
second host, etc.
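To make the layout I expect concrete, here is a minimal Python sketch. It is
not MPICH2 code and says nothing about what mpiexec does internally; the
function names are hypothetical, and the round-robin variant is only an
assumed example of a non-contiguous layout, included for contrast.

```python
def parse_machinefile(text):
    """Parse 'host:ncores' lines into (host, ncores) pairs.

    Assumption: a bare hostname with no ':n' suffix counts as one core.
    """
    hosts = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        name, _, count = line.partition(":")
        hosts.append((name, int(count) if count else 1))
    return hosts


def block_placement(hosts, nranks):
    """Contiguous layout: fill every core on a host before moving on."""
    slots = [name for name, ncores in hosts for _ in range(ncores)]
    # Wrap around if more ranks than cores are requested.
    return [slots[rank % len(slots)] for rank in range(nranks)]


def cyclic_placement(hosts, nranks):
    """Round-robin layout: one rank per host in turn (for contrast only)."""
    names = [name for name, _ in hosts]
    return [names[rank % len(names)] for rank in range(nranks)]


machines = "node1:8\nnode2:8\n"
hosts = parse_machinefile(machines)
block = block_placement(hosts, 16)    # ranks 0-7 on node1, 8-15 on node2
cyclic = cyclic_placement(hosts, 16)  # ranks alternate node1, node2, ...

for rank, (b, c) in enumerate(zip(block, cyclic)):
    print(f"rank {rank:2d}: block={b} cyclic={c}")
```

With the two-line machinefile above, block_placement gives the contiguous
layout I was expecting, while cyclic_placement puts neighboring ranks on
different hosts.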
That contiguous layout does not seem to be what I get with MPICH2 1.0.7
or 1.0.8p1. When I run a large CFD code (Overflow), I see MPICH2 take
twice as long as Open-MPI. I finally tracked it down to ranks not being
contiguous. If I modify my mpiexec command to:
$ mpiexec -machinefile machines -n <num_cores> ...
where machines is the file I passed to mpdboot, it then runs as fast
as Open-MPI.
What logic does mpiexec use to assign ranks to hosts? It seems
redundant to pass the machinefile to both mpdboot and mpiexec.
Intel MPI's mpiexec has a -perhost <n> flag that helps accomplish
this.
Thanks,
Scott