[MPICH] problems running a program using MPI
Michael Baxa
baxa at uchicago.edu
Fri Feb 22 11:06:30 CST 2008
Hello,
I have two problems that may or may not be related, and they may well
be elementary. Both pertain to a Fortran 90 program that is supposed to
run on multiple nodes of a cluster managed by a Torque (PBS) and Maui
job queue server.
Problem #1: Before I call my program, I boot mpd with the following:
`mpdboot --file=$PBS_NODEFILE`
However, only the first node in $PBS_NODEFILE is booted. In an
interactive session, I work around this by ssh'ing to each of the other
nodes in $PBS_NODEFILE and calling `mpd -h <node1> -p <node1 port>`.
After doing this, `mpdtrace` returns all of the nodes in $PBS_NODEFILE.
I don't know why this isn't done automatically in the first place.
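One possible cause worth ruling out, offered as an assumption rather than a diagnosis: in the MPICH2 mpd tools, mpdboot starts only one mpd unless it is given a count with -n, and Torque typically lists a host in $PBS_NODEFILE once per allocated core. The sketch below builds a de-duplicated host file and counts its entries. The function names and the file name mpd.hosts are hypothetical, chosen only for illustration.

```shell
# Write the unique hostnames from a PBS node file ($1) to a host file ($2).
# Torque usually repeats a host once per allocated core, so duplicates
# are removed here.
make_mpd_hosts() {
    sort -u "$1" > "$2"
}

# Print the number of lines (hosts) in a host file.
count_hosts() {
    wc -l < "$1" | tr -d ' '
}

# Inside a job, one might then run (mpd.hosts is a hypothetical name):
#   make_mpd_hosts "$PBS_NODEFILE" mpd.hosts
#   mpdboot -n "$(count_hosts mpd.hosts)" -f mpd.hosts
#   mpdtrace
```

If mpdtrace then lists every node without the manual ssh step, the missing -n count was likely the issue. If it still shows only the first node, the cause lies elsewhere, for example in how ssh reaches the other nodes.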
Problem #2: With mpd booted on all my nodes, I call my program
using
`mpirun -machinefile $PBS_NODEFILE -np <number of nodes> <program>`
The program seems to hang the first time it reaches MPI_BCAST. For an
array named temps, I have the following:
    do i=0,nproc-1
       print *, 'i = ',i
       call MPI_BCAST (temps(i), 1, MPI_DOUBLE_PRECISION, i, MPI_COMM_WORLD, ierr)
    enddo
The last thing the program outputs is 'i = 0'.
One side note: I put a `call MPI_BARRIER(MPI_COMM_WORLD,ierr)` before
this do loop just to see what would happen, and every node froze at that
line.
I do not know whether this issue is appropriate for this list, but I
would greatly appreciate any suggestions.
Thank you,
Michael Baxa