[mpich-discuss] Specifying hosts

Rajeev Thakur thakur at mcs.anl.gov
Tue May 5 16:22:29 CDT 2009


Scott,
      You can do it with the --ncpus option to mpdboot, as described on page
18 of the installation guide:
http://www.mcs.anl.gov/research/projects/mpich2/documentation/files/mpich2-1.0.8-installguide.pdf

You can run "mpdboot --ncpus=8 -n <num_hosts> -f machines".
Then "mpiexec -l -n 24 hostname" will show you how the ranks are allocated.

Rajeev


> -----Original Message-----
> From: mpich-discuss-bounces at mcs.anl.gov 
> [mailto:mpich-discuss-bounces at mcs.anl.gov] On Behalf Of Scott Atchley
> Sent: Tuesday, May 05, 2009 3:45 PM
> To: mpich-discuss at mcs.anl.gov
> Subject: [mpich-discuss] Specifying hosts
> 
> Hi all,
> 
> I have run into a behavior that I did not expect. I have been using  
> mpdboot to launch mpds on hosts with eight cores. I specify a  
> machinefile with lines such as:
> 
> node1:8
> node2:8
> 
> I then call mpdboot with:
> 
> $ mpdboot -n <num_hosts> -f machines -m `which mpd`
> 
> This works and mpdtrace shows all the hosts.
> 
> I then launch a job with:
> 
> $ mpiexec -n <num_cores> ...
> 
> expecting eight ranks per machine _and_ that the ranks are allocated
> sequentially by host, that is, ranks 0-7 on the first host, 8-15 on
> the second host, etc.
> 
> This does not seem to be the case with 1.0.7 or 1.0.8p1. When I run a
> large CFD code (Overflow), I see MPICH2 take twice as long as Open MPI.
> I finally tracked it down to the ranks not being contiguous. If I
> modify my mpiexec command to:
> 
> $ mpiexec -machinefile machines -n <num_cores> ...
> 
> where machines is the file I passed to mpdboot, it then runs as fast
> as Open MPI.
> 
> What logic does mpiexec use to assign ranks to hosts? It seems
> redundant to pass the machinefile to both mpdboot and mpiexec. Intel
> MPI's mpiexec has a -perhost <n> flag that helps accomplish this.
> 
> Thanks,
> 
> Scott
> 
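
For comparison, a sketch of the workaround Scott describes above, reusing the
same placeholder two-host "machines" file: passing the file to mpiexec as well
ties the rank layout to the file order.

# Boot the mpd ring as before
$ mpdboot -n 2 -f machines -m `which mpd`

# With the machinefile repeated on the mpiexec line, ranks 0-7 should land
# on node1 and ranks 8-15 on node2, i.e. contiguous blocks per host
# (./a.out stands in for the real application)
$ mpiexec -machinefile machines -n 16 ./a.out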


