[mpich-discuss] Specifying hosts

Scott Atchley atchley at myri.com
Tue May 5 15:45:09 CDT 2009


Hi all,

I have run into a behavior that I did not expect. I have been using  
mpdboot to launch mpds on hosts with eight cores. I specify a  
machinefile with lines such as:

node1:8
node2:8

I then call mpdboot with:

$ mpdboot -n <num_hosts> -f machines -m `which mpd`

This works and mpdtrace shows all the hosts.

I then launch a job with:

$ mpiexec -n <num_cores> ...

expecting 8 cores per machine _and_ that the ranks are allocated
sequentially by host. That is, ranks 0-7 on the first host, 8-15 on
the second host, etc.
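
To make the expectation concrete, here is a minimal Python sketch of the
contiguous (block) layout described above, next to a round-robin layout
for contrast. The round-robin pattern is only an assumed example of a
non-contiguous placement; it is not a statement of what mpiexec actually
does. The function names are hypothetical and are not part of MPICH2 or
mpd.

```python
def parse_machinefile(text):
    """Parse 'host:count' lines into (host, count) pairs; count defaults to 1."""
    hosts = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, _, count = line.partition(":")
        hosts.append((name, int(count) if count else 1))
    return hosts


def block_placement(hosts, nranks):
    """Expected layout: fill every slot on a host before moving to the next."""
    slots = [name for name, count in hosts for _ in range(count)]
    return slots[:nranks]  # shorter than nranks if there are too few slots


def round_robin_placement(hosts, nranks):
    """Assumed contrast case: deal ranks one at a time across the hosts."""
    remaining = [count for _, count in hosts]
    placement = []
    i = 0
    while len(placement) < nranks and any(remaining):
        if remaining[i] > 0:
            placement.append(hosts[i][0])
            remaining[i] -= 1
        i = (i + 1) % len(hosts)
    return placement


# Same contents as the machinefile shown earlier in this message.
machines = "node1:8\nnode2:8\n"
hosts = parse_machinefile(machines)
block = block_placement(hosts, 16)
cyclic = round_robin_placement(hosts, 16)

for rank, (b, c) in enumerate(zip(block, cyclic)):
    print(f"rank {rank:2d}: block={b}  round-robin={c}")
```

In the block layout, neighboring ranks share a host, which is the
placement I was expecting.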

This does not seem to be the case with 1.0.7 or 1.0.8p1. When I run a
large CFD code (Overflow), MPICH2 takes twice as long as Open MPI. I
finally tracked it down to the ranks not being contiguous. If I
modify my mpiexec command to:

$ mpiexec -machinefile machines -n <num_cores> ...

where machines is the same file I passed to mpdboot, it then runs as
fast as Open MPI.

What logic does mpiexec use to assign ranks to hosts? It seems
redundant to pass the machinefile to both mpdboot and mpiexec. Intel
MPI's mpiexec has a -perhost <n> flag that helps accomplish this.

Thanks,

Scott

