[mpich-discuss] Wrong distribution of parallel processes (mpd machinefile)
Scott Atchley
atchley at myri.com
Wed Aug 19 09:41:06 CDT 2009
Mário,
You need to pass this to mpdboot:
--ncpus=4
Even though you specify the CPU count in the machinefile, mpdboot also
needs it on the command line.
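
A sketch of the full sequence, assuming the rest of your mpdboot
invocation stays exactly as you posted it. My understanding is that
--ncpus covers the host where mpdboot itself runs, but treat that as an
assumption and check mpdboot --help on your version. count_per_host is a
hypothetical helper name, not part of MPICH2; it only tallies the
hostnames printed by the processes so you can see the placement.

```shell
# Hypothetical helper (not part of MPICH2): reads one hostname per line
# on stdin and prints how many times each hostname occurs.
count_per_host() {
    sort | uniq -c
}

# Only attempt the MPI commands where mpd is actually installed.
if command -v mpdboot >/dev/null 2>&1; then
    # Same options as the original command, plus the explicit CPU count.
    mpdboot --remcons -n 2 -f machinefile --ncpus=4

    # Each of the 8 processes prints its host; tally them per node.
    mpiexec -np 8 hostname | count_per_host
fi
```

If the placement is right, each node should show a count of 4.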
Scott
On Aug 19, 2009, at 10:34 AM, Mário Costa wrote:
> Hello,
>
> I hope someone can help me with the following problem.
>
>
> I'm using mpich2 ch3:nemesis device with mpd.
>
> I use mpdboot to start the mpd ring for the mpi job, using the
> following command:
>
> mpdboot --remcons -n 2 -f machinefile
>
> the machine file has
>
> node001:4
> node002:4
>
> then I start the mpi job via mpiexec:
>
> mpiexec -np 8 ./mpi_executable
>
> Now the problem I'm having is that the node001 has 3 mpi processes and
> the node002 has 5, but it was supposed to be distributed 4 per node,
> as specified in the machinefile.
>
> Does anyone have an idea what the problem might be?
>
> I'm using mpich2 version 1.0.8. With mpich2 version 1.0.5 I had no
> such problems. I've also tested with 1.1.1 and had the same problem.
>
> Thanks in advance!
>
> Regards,
> Mário
>