[MPICH] mpich and pbs

Anthony Chan chan at mcs.anl.gov
Wed Jun 20 12:46:50 CDT 2007



On Wed, 20 Jun 2007, Steve Young wrote:

> Hello everyone,
> 	I still seem to be having an issue with getting mpich to work properly.
> I have version mpich2-1.0.5 compiled. This works as expected when I use
> mpiexec or mpirun. However, the nodes that jobs run on aren't in sync
> with the nodes that PBS allocates to the job. In posting to the list
> before I was informed to use the mpiexec from OSC that works with PBS. I
> installed that and jobs now get started on the proper nodes that PBS
> allocates. However, now it appears that the program being run is in
> serial. For example, an 8 cpu job gets started on two nodes (each node
> has 4 cpus - 2 dual-core Opterons). We see all 8 processes running on
> the nodes. But in looking at the output it appears like a serial job. I
> get the same results trying to use vasp and amber. So I'm not sure what
> I could do to correct this. Any ideas?

I am not familiar with vasp/amber, so I can't comment on the specifics.
What kind of output makes you think that the 8-process job runs
like a serial job?  From what you said, I assume the job finishes
normally.  If so, you can profile the job with MPE logging/Jumpshot to
check whether anything is getting stuck.
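
Independently of profiling, it may help to confirm what PBS actually
allocated before comparing that with where the processes run. Below is a
minimal sketch in Python, assuming the job runs under PBS/Torque, which
exports PBS_NODEFILE pointing at a file with one hostname per allocated
slot. The function name slots_per_host is made up for illustration and is
not part of MPICH or PBS.

```python
import os
from collections import Counter


def slots_per_host(nodefile_text):
    """Count allocated slots per host from PBS nodefile contents.

    Assumes one hostname per line, repeated once per allocated slot.
    Blank lines are ignored.
    """
    hosts = [line.strip() for line in nodefile_text.splitlines() if line.strip()]
    return Counter(hosts)


if __name__ == "__main__":
    # PBS sets PBS_NODEFILE inside a running job; outside a job it is absent.
    path = os.environ.get("PBS_NODEFILE")
    if path is None:
        print("PBS_NODEFILE is not set; run this inside a PBS job")
    else:
        with open(path) as f:
            counts = slots_per_host(f.read())
        for host, slots in sorted(counts.items()):
            print(f"{host}: {slots} slot(s)")
```

For the 8-process job on two 4-cpu nodes described above, one would expect
two hostnames with 4 slots each. A different count would point at the
PBS request rather than at MPICH.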

A.Chan



