[mpich-discuss] Control getting stuck at mpiexec ./a.out

Dave Goodell goodell at mcs.anl.gov
Wed Apr 6 12:36:25 CDT 2011


MPICH1 is no longer supported; please use MPICH2 instead:

http://www.mcs.anl.gov/research/projects/mpich2/

If the problem persists even with MPICH2, then you should contact someone knowledgeable about PBS (such as the PBS developers or your local system administrators).

-Dave

On Apr 6, 2011, at 12:11 PM CDT, Sanjaya Gajurel wrote:

> Hi,
> 
> We have successfully installed MPICH-1.2.7p1 on our cluster (RHEL 5.5, x86_64 architecture).
> 
> I am able to obtain the correct output; however, control gets stuck at mpiexec ./a.out and never returns.
> 
> This is the test file main.c.
> 
> ============================================
> #include <stdio.h>
> #include <mpi.h>
> 
> int
> main (int argc, char *argv[])
> {
>   int rank, rc, source, dest, numtasks;
>   int msg;
>   MPI_Status Stat;
> 
>   MPI_Init (&argc, &argv);
>   MPI_Comm_size (MPI_COMM_WORLD, &numtasks);
>   MPI_Comm_rank (MPI_COMM_WORLD, &rank);
> 
> 
>   if (rank == 0)
>     printf ("World is composed by %d nodes\n", numtasks);
> 
>   source = (rank > 0) ? (rank - 1) : numtasks - 1;
>   dest = (rank + 1) % numtasks;
> 
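>   /* Note: a blocking MPI_Send before MPI_Recv on every rank relies on the
>      library buffering this small message; MPI_Sendrecv avoids that. */
>   rc = MPI_Send (&rank, 1, MPI_INT, dest, 1, MPI_COMM_WORLD);
>   rc = MPI_Recv (&msg, 1, MPI_INT, source, 1, MPI_COMM_WORLD, &Stat);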
> 
>   printf
>     ("I am proc number %d getting message %d from source %d and sending message %d to dest %d\n",
>      rank, msg, source, rank, dest);
>   // MPI_Barrier (MPI_COMM_WORLD);
>   MPI_Finalize ();
>   return 0;
> }
> 
> ============================================
> 
> This is the PBS script (test.pbs) to submit the job in our cluster:
> 
> ==============================================
> #PBS -N test
> #PBS -l walltime=00:10:00
> #PBS -l nodes=8:ppn=1
> #PBS -m be
> 
> ##PBS -e test.err -o test.out
> 
> module load mpich
> #module load openmpi
> # cd to the directory where the job was submitted
> cd $PBS_O_WORKDIR
> pbsdcp -s * $PFSDIR
> 
> cd $PFSDIR
> 
> # Execute program
> mpiexec mpitest
> 
> echo "Ready to copy"
> #pbsdcp -g '*' $PBS_O_WORKDIR
> 
> cd $PBS_O_WORKDIR
> 
> ==============================================
> 
> This is the output file:
> 
> =================================================
> I am proc number 1 getting message 0 from source 0 and sending message 1 to dest 2
> World is composed by 8 nodes
> I am proc number 0 getting message 7 from source 7 and sending message 0 to dest 1
> I am proc number 5 getting message 4 from source 4 and sending message 5 to dest 6
> I am proc number 3 getting message 2 from source 2 and sending message 3 to dest 4
> I am proc number 7 getting message 6 from source 6 and sending message 7 to dest 0
> I am proc number 4 getting message 3 from source 3 and sending message 4 to dest 5
> I am proc number 6 getting message 5 from source 5 and sending message 6 to dest 7
> I am proc number 2 getting message 1 from source 1 and sending message 2 to dest 3
> 
> ================================================
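
For reference, the ring arithmetic in main.c can be checked without an MPI launcher. The following is only a plain-C sketch, not part of the original post: the helper names ring_source, ring_dest and ring_simulate are hypothetical, and the simulation assumes every send is delivered. It yields the same source, dest and message values as the output above, which is consistent with the report that the output itself is correct.

```c
#include <assert.h>

/* Rank that sends to `rank` in a ring of `numtasks` processes.
   Equivalent to the post's (rank > 0) ? (rank - 1) : numtasks - 1. */
int ring_source(int rank, int numtasks)
{
    assert(numtasks > 0 && rank >= 0 && rank < numtasks);
    return (rank + numtasks - 1) % numtasks;
}

/* Rank that `rank` sends to; same expression as in main.c. */
int ring_dest(int rank, int numtasks)
{
    assert(numtasks > 0 && rank >= 0 && rank < numtasks);
    return (rank + 1) % numtasks;
}

/* Simulate one ring exchange without MPI: each rank "sends" its own
   rank number to its dest, so msg[r] holds what rank r would receive.
   `msg` must have room for `numtasks` ints. */
void ring_simulate(int numtasks, int *msg)
{
    for (int rank = 0; rank < numtasks; rank++)
        msg[ring_dest(rank, numtasks)] = rank;
}
```

With numtasks = 8, rank 0 receives 7 from source 7 and sends to dest 1, matching the second "I am proc number" line in the output.
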
> 
> The problem is that the echo command after "mpiexec mpitest" is never executed.
> 
> I would appreciate your help.
> 
> Thanks,
> 
> -Sanjaya
> 
> -- 
> ========================
> Sanjaya Gajurel, Ph.D.
> Computational Scientist
> sxg125 at case.edu
> Advance Research Computing
> 216-368-5717 (office)
> 216-315-4136 (cell)
> Crawford 508
> Case Western Reserve University
> 10900 Euclid Ave
> Cleveland, OH 44106
> =========================


