[mpich-discuss] mpich2 MPI_TEST errors

Samir Khanal skhanal at bgsu.edu
Sun Mar 15 19:52:03 CDT 2009


Hi Pavan,

My mpd.hosts file already contains the following entries:

compute-0-0:4
compute-0-1:4
compute-0-2:4
compute-0-3:4
compute-0-4:4
compute-0-5:4

and I have already started mpd on all the nodes:

mpdboot -n 7

Do I need to specify this again in the PBS submit script?

I tried again, and

mpiexec -n 1 ./Ring works
but
mpiexec -n 2 ./Ring does not work.


MPICH2 Version:         1.0.8
MPICH2 Release date:    Unknown, built on Fri Feb 20 12:36:01 EST 2009
MPICH2 Device:          ch3:nemesis
MPICH2 configure:       --prefix=/home/skhanal/mpich2 --with-device=ch3:nemesis
MPICH2 CC:      gcc  -O2
MPICH2 CXX:     c++  -O2
MPICH2 F77:     gfortran  -O2
MPICH2 F90:     f95  -O2

Please help.
Samir
________________________________________
From: mpich-discuss-bounces at mcs.anl.gov [mpich-discuss-bounces at mcs.anl.gov] On Behalf Of Pavan Balaji [balaji at mcs.anl.gov]
Sent: Sunday, March 15, 2009 5:19 PM
To: mpich-discuss at mcs.anl.gov
Subject: Re: [mpich-discuss] mpich2 MPI_TEST errors

> I found the culprit function.
> It was indeed a problem with the MPI_Test call; I tracked it down, and the program works now.

Great!

> But now I am having a hard time getting the same program to run with mpich2 1.0.8/PBS on an x86_64 system.
> It compiles and runs perfectly as a single process,
> i.e., mpiexec -n 1 ./Ring
> executes and generates output.
>
> But as soon as I run mpiexec -n 2 or more, it just waits and eventually the job is thrown out of the queue.

Did you launch your mpd daemons correctly? See section 5.7.1 in the
MPICH2 users' guide:
http://www.mcs.anl.gov/research/projects/mpich2/documentation/files/mpich2-1.0.8-userguide.pdf

PBS uses a slightly different node name representation than MPICH2's
MPD, but it should be trivial to convert between the two formats.
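
As a rough illustration of that conversion, here is a minimal Python sketch.
It assumes the PBS node file (normally pointed to by $PBS_NODEFILE) lists one
hostname per allocated slot, so a host with 4 slots appears 4 times, and that
MPD wants one "host:ncpus" line per host, as in an mpd.hosts file. The
function name pbs_nodes_to_mpd_hosts is made up for this example; check the
node file your PBS installation actually writes before relying on it.

```python
from collections import Counter


def pbs_nodes_to_mpd_hosts(pbs_lines):
    """Collapse PBS node-file lines into mpd.hosts-style 'host:ncpus' lines.

    Assumption: each input line is a hostname, repeated once per slot.
    Hosts are emitted in the order they first appear.
    """
    counts = Counter()
    order = []
    for line in pbs_lines:
        host = line.strip()
        if not host:
            continue  # skip blank lines
        if host not in counts:
            order.append(host)
        counts[host] += 1
    return [f"{host}:{counts[host]}" for host in order]


# Example with a hypothetical two-node allocation:
sample = ["compute-0-0", "compute-0-0", "compute-0-1", "compute-0-1"]
for entry in pbs_nodes_to_mpd_hosts(sample):
    print(entry)
```

Inside a job script, the input lines would come from the file named by
$PBS_NODEFILE, and the number of entries returned is the number of distinct
hosts in the allocation.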

> Does mpich2 have any special configuration for multi-core machines?
> Any tips on job submission or compiling,
> if I just used
>
> ./configure --with-device=ch3:nemesis

MPICH2 will automatically detect multi-core systems and optimize inter-core
communication.

  -- Pavan

--
Pavan Balaji
http://www.mcs.anl.gov/~balaji

