[mpich-discuss] MPD in the PBS environment

Anthony Chan chan at mcs.anl.gov
Thu Feb 12 10:48:23 CST 2009


Did you set NNODES in your PBS script?

----- "Anne M. Hammond" <hammond at txcorp.com> wrote:

> These are the relevant lines from the qsub file:
> 
> sort -u $PBS_NODEFILE > mpd.hosts
> mpdboot -f mpd.hosts -n $NNODES --rsh=/usr/bin/rsh
> mpiexec -machinefile $PBS_NODEFILE -np $NNODES $RUNJOB -i $WORK_AREA/$PREFILE/$PREFILE.in -dim 2 -n 100000 -d 10000 > $PREFILE.log
> mpdallexit
> 
> mpd.hosts:
> node12
> node13
> 
> When the ring is not running, this is the error message from the
> PBS job:
> 
> mpdroot: cannot connect to local mpd at: /tmp/mpd2.console_root
>      probable cause:  no mpd daemon on this machine
>      possible cause:  unix socket /tmp/mpd2.console_root has been removed
> mpiexec_node12.cl.corp.com (__init__ 1190): forked process failed; status=255
> 
> Do you have to have a persistent ring booted in order to use mpd
> from PBS?  Or is my qsub script incorrect?
> 
> Thanks in advance,
> Anne
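
If NNODES is not defined anywhere in the job script, the -n option of mpdboot receives no value, which may be why no ring is started. As a minimal sketch, assuming $PBS_NODEFILE lists one line per allocated processor slot, the counts could be derived inside the script itself. The helper names count_hosts and count_slots, and the NPROCS variable, are hypothetical and not part of PBS or MPICH; the mpdboot and mpiexec lines are left as comments so the fragment can be tried outside a cluster.

```shell
#!/bin/sh
# Sketch only: derive the counts the qsub script needs from $PBS_NODEFILE.
# count_hosts, count_slots and NPROCS are hypothetical names for illustration.

# Number of distinct hosts in the node file (one mpd is started per host).
count_hosts() {
    sort -u "$1" | wc -l | tr -d ' '
}

# Total number of lines in the node file (assumed: one per processor slot).
count_slots() {
    wc -l < "$1" | tr -d ' '
}

# Only act when PBS has actually provided a readable node file.
if [ -n "${PBS_NODEFILE:-}" ] && [ -r "$PBS_NODEFILE" ]; then
    sort -u "$PBS_NODEFILE" > mpd.hosts
    NNODES=$(count_hosts "$PBS_NODEFILE")
    NPROCS=$(count_slots "$PBS_NODEFILE")
    echo "hosts=$NNODES slots=$NPROCS"
    # The original commands would then become, for example:
    # mpdboot -f mpd.hosts -n "$NNODES" --rsh=/usr/bin/rsh
    # mpiexec -machinefile "$PBS_NODEFILE" -np "$NPROCS" ...
    # mpdallexit
fi
```

Using the host count for mpdboot and the slot count for mpiexec is a design choice of this sketch; the original script passes the same $NNODES to both.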

