[mpich-discuss] MPD in the PBS environment
Anthony Chan
chan at mcs.anl.gov
Thu Feb 12 10:48:23 CST 2009
Did you set NNODES in your PBS script?
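
If it is not set anywhere, one way to derive it inside the job is sketched below. This is only a sketch under assumptions: count_hosts is a hypothetical helper name (not part of MPICH or PBS), and it assumes the node file lists one host name per line, possibly repeated once per processor, as the "sort -u $PBS_NODEFILE" line in the quoted script suggests.

```shell
# Hypothetical helper (not part of MPICH or PBS): count the distinct
# hosts in a PBS-style node file, one host name per line.
count_hosts() {
    sort -u "$1" | wc -l | tr -d '[:space:]'
}

# PBS sets PBS_NODEFILE inside a job; do nothing when run elsewhere.
if [ -n "${PBS_NODEFILE:-}" ] && [ -r "$PBS_NODEFILE" ]; then
    NNODES=$(count_hosts "$PBS_NODEFILE")
    export NNODES
    echo "NNODES=$NNODES"
fi
```

With the two-host mpd.hosts quoted below this would give NNODES=2, which is the count mpdboot -n expects (the number of mpds to start). Note that mpiexec -np is a process count, so it need not be the same number if you want more than one process per node.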
----- "Anne M. Hammond" <hammond at txcorp.com> wrote:
> These are the relevant lines from the qsub file:
>
> sort -u $PBS_NODEFILE > mpd.hosts
> mpdboot -f mpd.hosts -n $NNODES --rsh=/usr/bin/rsh
> mpiexec -machinefile $PBS_NODEFILE -np $NNODES $RUNJOB \
>     -i $WORK_AREA/$PREFILE/$PREFILE.in -dim 2 -n 100000 -d 10000 > $PREFILE.log
> mpdallexit
>
> mpd.hosts:
> node12
> node13
>
> When the ring is not running, this is the error message from the
> PBS job:
>
> mpdroot: cannot connect to local mpd at: /tmp/mpd2.console_root
> probable cause: no mpd daemon on this machine
> possible cause: unix socket /tmp/mpd2.console_root has been removed
> mpiexec_node12.cl.corp.com (__init__ 1190): forked process failed; status=255
>
> Do you have to have a persistent ring booted in order to use mpd
> from PBS? Or is my qsub script incorrect?
>
> Thanks in advance,
> Anne