[MPICH] Problem with MPICH2 mpiexec with different executables
Ralph Butler
rbutler at mtsu.edu
Thu May 4 22:44:11 CDT 2006
> Hi Matthew:
>
> If you run something like this:
> mpiexec -n 1 master_mpi_pgm : -n 8 slave_mpi_pgm
> you are saying that you have a 9 process MPI process job that
> happens to run 2 different programs.
> If you were to call MPI_Comm_size in this program, you would get 21.
Correction: you would get 9
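
The arithmetic behind that number can be sketched in a few lines of C++. This is only a simplified model of how the colon-separated segments add up, not mpiexec's actual argument parser. The function name world_size_from_args is hypothetical, a segment without -n is assumed to default to one process, and any other flag before the executable name (such as -l) is assumed to take no value.

```cpp
#include <cstddef>
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical helper (not part of MPICH2): sums the "-n <count>" values
// of every colon-separated segment, which is the size a single MPI job
// started this way would report from MPI_Comm_size.
int world_size_from_args(const std::vector<std::string>& args) {
    int total = 0;
    int count = 1;            // assumed default when a segment has no -n
    bool in_program = false;  // true once the executable name has been seen
    bool nonempty = false;    // true once the current segment has any token
    for (std::size_t i = 0; i < args.size(); ++i) {
        const std::string& a = args[i];
        if (a == ":") {
            // End of one segment: add its process count and reset.
            if (nonempty) total += count;
            count = 1;
            in_program = false;
            nonempty = false;
        } else if (in_program) {
            // Argument that belongs to the user program; ignored here.
        } else if (a == "-n" && i + 1 < args.size()) {
            count = std::atoi(args[++i].c_str());
            nonempty = true;
        } else if (!a.empty() && a[0] == '-') {
            nonempty = true;  // some other flag, assumed to take no value
        } else {
            in_program = true;  // first non-flag token is the executable
            nonempty = true;
        }
    }
    if (nonempty) total += count;  // account for the last segment
    return total;
}
```

Passing the tokens of "-n 1 master_mpi_pgm : -n 8 slave_mpi_pgm" to this function returns 9, and the tokens of "-l -n 1 ./myapp : -n 1 hostname" return 2, which is the job size that the MPI program in the second case waits for in MPI_Init.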
> When running MPI programs, mpd assumes that the entire set of
> processes are part of the computation.
> So using mpd, it is not allowed to run a mixed bag of MPI and non-
> MPI processes in a single job.
> If you run this:
> mpiexec -n 1 mpi_pgm : -n 1 non_mpi_pgm
> The mpi_pgm will hang at MPI_Init trying to set up the communicator
> for what it believes is a 2 process job.
>
> --ralph
>
> On May 4, 2006, at 8:55 PM, Matthew Siegel wrote:
>
>> Hi all,
>>
>> Thanks for the help in advance. Here's a detailed explanation of
>> the problem I'm running into.
>>
>> A little background . . . I am running the latest MPICH2. I am
>> using Rocks, running on Xeon EM64T processors with Gig-E IP
>> between the compute nodes. Not completely relevant, but want to
>> be complete.
>>
>> I have written an app that has the following source code. It is
>> very, very simple because my real program, which is quite
>> complicated, was not working, and I figured that this app would
>> work just fine.
>>
>> #include <mpi.h>
>> #include <stdio.h>
>>
>> int main(int argc, char** argv) {
>>     MPI::Init(argc, argv);
>>     printf("My app is running!!!!\n");
>>     MPI::Finalize();
>>     return 0;
>> }
>>
>> I compiled it like this:
>> mpicxx -o myapp my_app.cpp
>>
>> I start an mpd daemon on the head node with 'mpd &', and verify
>> with mpdtrace. So far so good.
>>
>> I then execute the following:
>> mpiexec -l -n 1 ./myapp
>>
>> and I get:
>> 0: My app is running!!!!
>>
>> and it quits. I then run:
>> mpiexec -l -n 4 ./myapp
>>
>> and I get (as expected):
>> 0: My app is running!!!!
>> 1: My app is running!!!!
>> 3: My app is running!!!!
>> 2: My app is running!!!!
>>
>> OK, so this is good so far. Here's where things go awry...
>>
>> I then run:
>> mpiexec -l -n 1 hostname : -n 1 date
>>
>> and I get (again as expected):
>> 0: <hostname>
>> 1: <date>
>>
>> Then I run (and here's where the problem is):
>> mpiexec -l -n 1 ./myapp : -n 1 hostname
>>
>> and I get:
>> 1: <hostname>
>>
>> And NOTHING else! Just hangs FOREVER . . . have to hit CTRL-C to
>> quit.
>>
>> This continues regardless of the order that I run my app in,
>> whether it's the first or second on the line, it does not matter.
>> Also, this is true regardless of the "other" app that I am
>> running, not necessarily just 'hostname' or 'date', and it does not
>> matter how many nodes I am using. The real apps that I am
>> trying to run both use MPI, and I am trying to pass data
>> between the two processes. I have rebuilt MPICH2 multiple times
>> and tried various configure options with zero luck.
>>
>> Please help, this has become a showstopper for me.
>>
>> Thanks!
>>
>> Matt
>