[MPICH] Problem with MPICH2 mpiexec with different executables

Matthew Siegel siegelmatt at gmail.com
Thu May 4 22:58:35 CDT 2006


OK, so I guess that makes sense: you cannot run a non-MPI program and an
MPI program together in the same job.

Of the two real programs we are trying to run, one calls MPI::Init() right
at the top of its main function, and the other has the call buried all the
way down inside the constructor of an object that it creates.  Is there an
issue with when each app calls MPI::Init()?
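
Roughly, the second program does something like this (a simplified sketch,
not the real code; the Worker class and everything in it is made up):

#include <mpi.h>
#include <stdio.h>

// Made-up class just to show the pattern: MPI::Init() is buried in a
// constructor instead of sitting at the top of main().
class Worker {
public:
    Worker(int argc, char** argv) {
        MPI::Init(argc, argv);           // init happens here, not in main()
        rank_ = MPI::COMM_WORLD.Get_rank();
    }
    ~Worker() { MPI::Finalize(); }
    int rank() const { return rank_; }
private:
    int rank_;
};

int main(int argc, char** argv) {
    Worker w(argc, argv);                // object constructed in main here
    printf("rank %d is up\n", w.rank());
    return 0;
}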

Thinking about this more, I would guess that MPI::Init() would have to be
called in the main thread, and right now I'm not sure that this object is
being created in the main thread.  Will have to check when I'm in front of
the code.

Thanks a lot for the help . . . still learning all the intricacies of MPI and
MPICH.

Matt

On 5/4/06, Ralph Butler <rbutler at mtsu.edu> wrote:
>
> > Hi Matthew:
> >
> > If you run something like this:
> >     mpiexec -n 1 master_mpi_pgm : -n 8 slave_mpi_pgm
> > you are saying that you have a 9-process MPI job that
> > happens to run 2 different programs.
> > If you were to call MPI_Comm_size in this program, you would get 21.
>
>      Correction: you would get 9
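>
>      (A quick way to see that: a tiny program along these lines, run
>      as either part of that mpiexec line, should print "of 9" on
>      every rank.  Just a sketch, nothing specific to MPICH.)
>
>      #include <mpi.h>
>      #include <stdio.h>
>
>      int main(int argc, char** argv) {
>          int size, rank;
>          MPI_Init(&argc, &argv);
>          MPI_Comm_size(MPI_COMM_WORLD, &size);  /* 9 for the job above */
>          MPI_Comm_rank(MPI_COMM_WORLD, &rank);
>          printf("rank %d of %d\n", rank, size);
>          MPI_Finalize();
>          return 0;
>      }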
>
> > When running MPI programs, mpd assumes that the entire set of
> > processes is part of the computation.
> > So under mpd it is not allowed to run a mixed bag of MPI and
> > non-MPI processes in a single job.
> > If you run this:
> >     mpiexec -n 1 mpi_pgm : -n 1 non_mpi_pgm
> > The mpi_pgm will hang at MPI_Init trying to set up the communicator
> > for what it believes is a 2-process job.
> >
> > --ralph
> >
> > On May 4, 2006, at 8:55 PM, Matthew Siegel wrote:
> >
> >> Hi all,
> >>
> >> Thanks for the help in advance.  Here's a detailed explanation of
> >> the problem I'm running into.
> >>
> >> A little background . . . I am running the latest MPICH2.  I am
> >> using Rocks, running on Xeon EM64T processors with Gig-E IP
> >> between the compute nodes.  Not completely relevant, but I want
> >> to be complete.
> >>
> >> I have written an app with the following source code.  It is
> >> very, very simple, because my real program, which is quite
> >> complicated, was not working, and I figured that this one would
> >> work just fine.
> >>
> >> #include <mpi.h>
> >> #include <stdio.h>
> >>
> >> int main(int argc, char** argv) {
> >>     MPI::Init(argc, argv);
> >>     printf("My app is running!!!!\n");
> >>     MPI::Finalize();
> >>     return 0;
> >> }
> >>
> >> I compiled it like this:
> >>     mpicxx -o myapp my_app.cpp
> >>
> >> I start an mpd daemon on the head node with 'mpd &' and verify
> >> it with mpdtrace.  So far so good.
> >>
> >> I then execute the following:
> >>     mpiexec -l -n 1 ./myapp
> >>
> >> and I get:
> >>     0: My app is running!!!!
> >>
> >> and it quits.  I then run:
> >>     mpiexec -l -n 4 ./myapp
> >>
> >> and I get (as expected):
> >>     0: My app is running!!!!
> >>     1: My app is running!!!!
> >>     3: My app is running!!!!
> >>     2: My app is running!!!!
> >>
> >> OK, so this is good so far.   Here's where things go awry...
> >>
> >> I then run:
> >>     mpiexec -l -n 1 hostname : -n 1 date
> >>
> >> and I get (again as expected):
> >>     0: <hostname>
> >>     1: <date>
> >>
> >> Then I run (and here's where the problem is):
> >>     mpiexec -l -n 1 ./myapp : -n 1 hostname
> >>
> >> and I get:
> >>     1: <hostname>
> >>
> >> And NOTHING else!  Just hangs FOREVER . . . have to hit CTRL-C to
> >> quit.
> >>
> >> This happens regardless of the order in which I list my app:
> >> whether it is first or second on the line does not matter.
> >> It is also true regardless of the "other" app that I am running,
> >> not necessarily just 'hostname' or 'date', and it does not matter
> >> how many nodes I am using.  The real apps that I am trying to run
> >> both use MPI, and I am trying to pass data between the two
> >> processes.  I have rebuilt MPICH2 multiple times and tried
> >> various configure options with zero luck.
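> >>
> >> To show what I am ultimately after, the data exchange between the
> >> two processes is roughly like the sketch below (placeholder value,
> >> not the real code; in the real case the two sides are separate
> >> executables launched from one mpiexec line):
> >>
> >> #include <mpi.h>
> >> #include <stdio.h>
> >>
> >> // Needs at least 2 processes, e.g. mpiexec -n 2 ./sketch
> >> int main(int argc, char** argv) {
> >>     MPI::Init(argc, argv);
> >>     int rank = MPI::COMM_WORLD.Get_rank();
> >>     int value = 42;                      // placeholder payload
> >>     if (rank == 0) {
> >>         MPI::COMM_WORLD.Send(&value, 1, MPI::INT, 1, 0);
> >>     } else if (rank == 1) {
> >>         MPI::COMM_WORLD.Recv(&value, 1, MPI::INT, 0, 0);
> >>         printf("rank 1 received %d\n", value);
> >>     }
> >>     MPI::Finalize();
> >>     return 0;
> >> }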
> >>
> >> Please help, this has become a showstopper for me.
> >>
> >> Thanks!
> >>
> >> Matt
> >
>
>