[mpich-discuss] Wrong distribution of parallel processes (mpd machinefile)

Scott Atchley atchley at myri.com
Wed Aug 19 09:41:06 CDT 2009


Mário,

You need to pass this to mpdboot:

--ncpus=4

Even though you specify the CPU count in the machinefile, mpdboot also
needs it on the command line.
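
Applied to your command, that would look roughly like the sketch below.
I'm assuming mpdboot and mpiexec are on your PATH and that you run this
from the directory holding your machinefile. count_per_host is just a
hypothetical helper (not part of MPICH2) that tallies the hostnames
printed by mpiexec, so you can check where the processes landed.

```shell
#!/bin/sh
# Hypothetical helper: read one hostname per line on stdin and print
# "host: count" for each distinct host.
count_per_host() {
    sort | uniq -c | awk '{ print $2 ": " $1 }'
}

# Only touch the ring if the mpd tools are actually installed.
if command -v mpdboot >/dev/null 2>&1; then
    # Same command as in your message, plus the explicit CPU count.
    mpdboot --remcons -n 2 -f machinefile --ncpus=4

    # Run hostname as the "MPI job" to see how the 8 processes are placed.
    mpiexec -np 8 hostname | count_per_host
fi
```

If the placement is right, the tally should show 4 processes for each
of node001 and node002.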

Scott

On Aug 19, 2009, at 10:34 AM, Mário Costa wrote:

> Hello,
>
> I hope someone can help me with the following problem.
>
>
> I'm using mpich2 ch3:nemesis device with mpd.
>
> I use mpdboot to start the mpd ring for the mpi job, using the
> following command:
>
> mpdboot --remcons -n 2 -f machinefile
>
> the machine file has
>
> node001:4
> node002:4
>
> then I start the mpi job via mpiexec:
>
> mpiexec -np 8 ./mpi_executable
>
> Now the problem I'm having is that node001 has 3 mpi processes and
> node002 has 5, but they were supposed to be distributed 4 per node,
> as specified in the machinefile.
>
> Does anyone have an idea what the problem might be?
>
> I'm using mpich2 version 1.0.8. With mpich2 version 1.0.5 I had no
> such problems ... I've also tested with 1.1.1 and had the same
> problem ...
>
> Thanks in advance!
>
> Regards,
> Mário
>


