[mpich-discuss] Specifying hosts
Rajeev Thakur
thakur at mcs.anl.gov
Tue May 5 16:22:29 CDT 2009
Scott,
You can do it with the --ncpus option to mpdboot as described on pg 18
of the installation guide.
http://www.mcs.anl.gov/research/projects/mpich2/documentation/files/mpich2-1.0.8-installguide.pdf
You can do mpdboot --ncpus=8 -n <num_hosts> -f machines
Then "mpiexec -l -n 24 hostname" will show you how the ranks are allocated.
Rajeev
> -----Original Message-----
> From: mpich-discuss-bounces at mcs.anl.gov
> [mailto:mpich-discuss-bounces at mcs.anl.gov] On Behalf Of Scott Atchley
> Sent: Tuesday, May 05, 2009 3:45 PM
> To: mpich-discuss at mcs.anl.gov
> Subject: [mpich-discuss] Specifying hosts
>
> Hi all,
>
> I have run into a behavior that I did not expect. I have been using
> mpdboot to launch mpds on hosts with eight cores. I specify a
> machinefile with lines such as:
>
> node1:8
> node2:8
>
> I then call mpdboot with:
>
> $ mpdboot -n <num_hosts> -f machines -m `which mpd`
>
> This works and mpdtrace shows all the hosts.
>
> I then launch a job with:
>
> $ mpiexec -n <num_cores> ...
>
> expecting 8 cores per machine _and_ that the ranks are allocated
> sequentially by host. That is, ranks 0-7 on the first host, 8-15 on
> the second host, etc.
>
> This does not seem to be the case with 1.0.7 or 1.0.8p1. When I run
> a large CFD code (Overflow), I see MPICH2 take twice as long as
> Open-MPI. I finally tracked it down to ranks not being contiguous.
> If I
> modify my mpiexec command with:
>
> $ mpiexec -machinefile machines -n <num_cores> ...
>
> where machines is the file I passed to mpdboot, it then runs as fast
> as Open-MPI.
>
> What logic does mpiexec use to assign ranks to hosts? It seems to be
> redundant to pass the machinefile to both mpdboot and mpiexec. In
> Intel MPI, their mpiexec has a -perhost <n> flag that helps
> accomplish this.
>
> Thanks,
>
> Scott
>