[MPICH] Problem with MPICH2 mpiexec with different executables

Rajeev Thakur thakur at mcs.anl.gov
Fri May 5 12:53:45 CDT 2006


> I will defer to others on when MPI_Init needs to be called, but it  
> seems that mpich2 treats it somewhat like
> a barrier with each process hanging until they all arrive.

Yes, MPI_Init calls the process manager's barrier before it returns. All
processes that are part of comm_world must call MPI_Init before any of the
MPI_Inits will return.

Rajeev


> -----Original Message-----
> From: owner-mpich-discuss at mcs.anl.gov 
> [mailto:owner-mpich-discuss at mcs.anl.gov] On Behalf Of Ralph Butler
> Sent: Thursday, May 04, 2006 11:07 PM
> To: Matthew Siegel
> Cc: mpich-discuss at mcs.anl.gov
> Subject: Re: [MPICH] Problem with MPICH2 mpiexec with 
> different executables
> 
> I will defer to others on when MPI_Init needs to be called, but it  
> seems that mpich2 treats it somewhat like
> a barrier with each process hanging until they all arrive.
> 
> On May 4, 2006, at 10:58 PM, Matthew Siegel wrote:
> 
> > OK, so I guess it makes sense that you cannot run a non-MPI
> > program and an MPI program in the same job.
> >
> > Of the real programs that we are trying to run, one has MPI::Init()
> > right at the top of the main function, and the other has it
> > buried all the way down inside the constructor of an object that
> > it creates.  Is there an issue with when each app calls
> > MPI::Init()?
> >
> > Thinking about this more, I would guess that MPI::Init() would have
> > to be called in the main thread, and right now I'm not sure that  
> > this object is being created in the main thread.  Will have to  
> > check when I'm in front of the code.
> >
> > Thanks a lot for the help . . . still learning all the intricacies  
> > of MPI and MPICH.
> >
> > Matt
> >
> > On 5/4/06, Ralph Butler <rbutler at mtsu.edu> wrote:
> > > Hi Matthew:
> > >
> > > If you run something like this:
> > >     mpiexec -n 1 master_mpi_pgm : -n 8 slave_mpi_pgm
> > > you are saying that you have a 9-process MPI job that
> > > happens to run 2 different programs.
> > > If you were to call MPI_Comm_size in this program, you would get 9.
> >
> > > When running MPI programs, mpd assumes that the entire set of
> > > processes is part of the computation.
> > > So with mpd, you cannot run a mixed bag of MPI and non-MPI
> > > processes in a single job.
> > > If you run this:
> > >     mpiexec -n 1 mpi_pgm : -n 1 non_mpi_pgm
> > > The mpi_pgm will hang at MPI_Init trying to set up the communicator
> > > for what it believes is a 2-process job.
> > >
> > > --ralph
> > >
> > > On May 4, 2006, at 8:55 PM, Matthew Siegel wrote:
> > >
> > >> Hi all,
> > >>
> > >> Thanks for the help in advance.  Here's a detailed explanation of
> > >> the problem I'm running into.
> > >>
> > >> A little background . . . I am running the latest MPICH2.  I am
> > >> using Rocks, running on Xeon EM64T processors with Gig-E IP
> > >> between the compute nodes.  Not completely relevant, but want to
> > >> be complete.
> > >>
> > >> I have written an app that has the following source code.  It is
> > >> very, very simple: my real program, which is quite complicated, was
> > >> not working, and I figured that this app would work just fine.
> > >>
> > >> #include <mpi.h>
> > >> #include <stdio.h>
> > >>
> > >> int main(int argc, char** argv) {
> > >>     MPI::Init(argc, argv);
> > >>     printf("My app is running!!!!\n");
> > >>     MPI::Finalize();
> > >>     return 0;
> > >> }
> > >>
> > >> I compiled it like this:
> > >>     mpicxx -o myapp my_app.cpp
> > >>
> > >> I start an mpd daemon on the head node with 'mpd &', and verify
> > >> with mpdtrace.  So far so good.
> > >>
> > >> I then execute the following:
> > >>     mpiexec -l -n 1 ./myapp
> > >>
> > >> and I get:
> > >>     0: My app is running!!!!
> > >>
> > >> and it quits.  I then run:
> > >>     mpiexec -l -n 4 ./myapp
> > >>
> > >> and I get (as expected):
> > >>     0: My app is running!!!!
> > >>     1: My app is running!!!!
> > >>     3: My app is running!!!!
> > >>     2: My app is running!!!!
> > >>
> > >> OK, so this is good so far.   Here's where things go awry...
> > >>
> > >> I then run:
> > >>     mpiexec -l -n 1 hostname : -n 1 date
> > >>
> > >> and I get (again as expected):
> > >>     0: <hostname>
> > >>     1: <date>
> > >>
> > >> Then I run (and here's where the problem is):
> > >>     mpiexec -l -n 1 ./myapp : -n 1 hostname
> > >>
> > >> and I get:
> > >>     1: <hostname>
> > >>
> > >> And NOTHING else!  Just hangs FOREVER . . . have to hit CTRL-C to
> > >> quit.
> > >>
> > >> This happens regardless of the order in which I list my app,
> > >> whether it is first or second on the line.
> > >> Also, this is true regardless of the "other" app that I am
> > >> running, not necessarily just 'hostname' or 'date', and it does not
> > >> matter how many nodes I am using.  The real apps that I am
> > >> trying to run both use MPI, and I am trying to pass data
> > >> between the two processes.  I have rebuilt MPICH2 multiple times,
> > >> and tried various configure options with zero luck.
> > >>
> > >> Please help, this has become a showstopper for me.
> > >>
> > >> Thanks!
> > >>
> > >> Matt
> > >
> >
> >
> 
> 



