[mpich-discuss] Error with MPI_Comm_spawn on MPICH2

Pavan Balaji balaji at mcs.anl.gov
Mon Nov 24 23:53:52 CST 2008


The default netmod in ch3:nemesis doesn't support dynamic processes in 
the 1.0.x series of MPICH2. You can either upgrade to 1.1a2 (preferred), 
or configure mpich2 with --with-device=ch3:sock if you want to stick 
with 1.0.8.
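
For reference, a rebuild with the sock channel would look roughly like the following; the source directory and install prefix are only examples, so adjust them to your setup:

  cd mpich2-1.0.8
  ./configure --with-device=ch3:sock --prefix=$HOME/mpich2-1.0.8-sock   # any writable prefix works
  make
  make install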

  -- Pavan

On 11/24/2008 11:48 PM, Shah, Mrunal J wrote:
> mpich2version
> MPICH2 Version:         1.0.8
> MPICH2 Release date:    Unknown, built on Fri Nov  7 23:45:25 EST 2008
> MPICH2 Device:          ch3:nemesis
> MPICH2 configure:       --prefix=/usr/local/apps/mpich2/1.0.8-nemesis--x86_64-gcc4.3.2 --with-device=ch3:nemesis
> MPICH2 CC:      gcc  -O2
> MPICH2 CXX:     c++  -O2
> MPICH2 F77:     g77  -O2
> MPICH2 F90:     pgf90
> 
> I am using MPD.
> 
> To launch my app, I use:
> mpd &
> mpiexec -n 4 ./newspawn
> 
> 
> Thanks
> Mrunal
> 
> ----- Original Message -----
> From: "Pavan Balaji" <balaji at mcs.anl.gov>
> To: mpich-discuss at mcs.anl.gov
> Sent: Tuesday, November 25, 2008 12:29:14 AM GMT -05:00 US/Canada Eastern
> Subject: Re: [mpich-discuss] Error with MPI_Comm_spawn on MPICH2
> 
> Mrunal,
> 
> You can install mpich2 in your home directory without any root access to 
> try it out.
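> 
> A rough sketch of such an install (the prefix under $HOME is only an 
> example; pick any writable directory):
> 
>   ./configure --prefix=$HOME/mpich2-install
>   make
>   make install
>   export PATH=$HOME/mpich2-install/bin:$PATH   # pick up the new mpd/mpiexec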
> 
> But before you do that, can you let us know which process manager you are 
> using? Can you send us the output of "mpich2version" and also the steps 
> you are following to launch your application?
> 
>   -- Pavan
> 
> On 11/24/2008 11:26 PM, Shah, Mrunal J wrote:
>> Thanks Rajeev,
>>
>> I am a student at Georgia Tech and am working with an MPICH installation on one of the college clusters.
>> I will ask for permission to reinstall it and let you know if I see anything out of the ordinary.
>>
>> Thanks a lot again!
>> Mrunal
>>
>> ----- Original Message -----
>> From: "Rajeev Thakur" <thakur at mcs.anl.gov>
>> To: mpich-discuss at mcs.anl.gov
>> Sent: Monday, November 24, 2008 11:59:40 PM GMT -05:00 US/Canada Eastern
>> Subject: Re: [mpich-discuss] Error with MPI_Comm_spawn on MPICH2
>>
>> I can run this with 1.0.8 without any problem. The program is too simple to
>> fail, so there is probably something wrong with your installation. Did you
>> use any special configure options or compilers? Try starting from scratch
>> with a fresh untar of the tarball. If you notice anything unusual in the
>> output of configure and make, send it to us.
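>> 
>> One way to keep that output around for later inspection (the tarball and
>> log file names below are just examples):
>> 
>>   tar xzf mpich2-1.0.8.tar.gz
>>   cd mpich2-1.0.8
>>   ./configure 2>&1 | tee configure.log
>>   make 2>&1 | tee make.log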
>>
>> Rajeev 
>>
>>> -----Original Message-----
>>> From: mpich-discuss-bounces at mcs.anl.gov 
>>> [mailto:mpich-discuss-bounces at mcs.anl.gov] On Behalf Of Shah, Mrunal J
>>> Sent: Monday, November 24, 2008 10:33 PM
>>> To: mpich-discuss at mcs.anl.gov
>>> Subject: Re: [mpich-discuss] Error with MPI_Comm_spawn on MPICH2
>>>
>>> #include "mpi.h"
>>> #include <stdio.h>
>>>
>>> int main(int argc, char **argv)
>>> {
>>>         int size, myRank;
>>>         MPI_Comm parent;
>>>         MPI_Init(&argc, &argv);
>>>         MPI_Comm_get_parent(&parent);
>>>         MPI_Comm_rank(MPI_COMM_WORLD, &myRank);
>>>
>>>         /* parent is the intercommunicator back to the spawning processes */
>>>         MPI_Comm_remote_size(parent, &size);
>>>         printf("IN WORKER: parent size %d\n", size);
>>>
>>>         printf("hello from worker %d\n", myRank);
>>>         MPI_Finalize();
>>>         return 0;
>>> }
>>>
>>> Thanks again!
>>> Mrunal
>>>
>>> ----- Original Message -----
>>> From: "Rajeev Thakur" <thakur at mcs.anl.gov>
>>> To: mpich-discuss at mcs.anl.gov
>>> Sent: Monday, November 24, 2008 11:20:42 PM GMT -05:00 
>>> US/Canada Eastern
>>> Subject: Re: [mpich-discuss] Error with MPI_Comm_spawn on MPICH2
>>>
>>> Can you send the worker program as well?
>>>
>>> Rajeev 
>>>
>>>> -----Original Message-----
>>>> From: mpich-discuss-bounces at mcs.anl.gov 
>>>> [mailto:mpich-discuss-bounces at mcs.anl.gov] On Behalf Of Shah, Mrunal J
>>>> Sent: Monday, November 24, 2008 10:18 PM
>>>> To: mpich-discuss at mcs.anl.gov
>>>> Subject: Re: [mpich-discuss] Error with MPI_Comm_spawn on MPICH2
>>>>
>>>> Using 1.0.8
>>>> Running on a x86_64 machine
>>>>
>>>> The test program:
>>>>
>>>> #include "mpi.h"
>>>>
>>>> int main(int argc, char **argv)
>>>> {
>>>>         int world_size, flag, myRank;
>>>>         MPI_Comm everyone;
>>>>
>>>>         MPI_Init(&argc, &argv);
>>>>         MPI_Comm_size(MPI_COMM_WORLD, &world_size);
>>>>         MPI_Comm_rank(MPI_COMM_WORLD, &myRank);
>>>>
>>>>         printf("world_size %d\n", world_size);
>>>>
>>>>         MPI_Comm_spawn("./workertest", MPI_ARGV_NULL, world_size, 
>>>> MPI_INFO_NULL, 0, MPI_COMM_WORLD, &everyone, MPI_ERRCODES_IGNORE);
>>>>
>>>>         printf("hello from master %d", myRank);
>>>>         MPI_Finalize();
>>>>         return 0;
>>>> }
>>>>
>>>> Thanks,
>>>> Mrunal
>>>>
>>>> ----- Original Message -----
>>>> From: "Rajeev Thakur" <thakur at mcs.anl.gov>
>>>> To: mpich-discuss at mcs.anl.gov
>>>> Sent: Monday, November 24, 2008 11:04:03 PM GMT -05:00 US/Canada 
>>>> Eastern
>>>> Subject: Re: [mpich-discuss] Error with MPI_Comm_spawn on MPICH2
>>>>
>>>> Which version of MPICH2 are you using and on what platform? 
>>>> Please send us a small test program if you can.
>>>>
>>>> Rajeev
>>>>
>>>>> -----Original Message-----
>>>>> From: mpich-discuss-bounces at mcs.anl.gov 
>>>>> [mailto:mpich-discuss-bounces at mcs.anl.gov] On Behalf Of Shah, Mrunal J
>>>>> Sent: Monday, November 24, 2008 8:32 PM
>>>>> To: mpich-discuss at mcs.anl.gov
>>>>> Subject: [mpich-discuss] Error with MPI_Comm_spawn on MPICH2
>>>>>
>>>>> Hi All,
>>>>>
>>>>> I am having issues with spawning processes using MPI_Comm_spawn.
>>>>> Using a simple example for spawning processes:
>>>>>
>>>>>  MPI_Comm_spawn("./workertest", MPI_ARGV_NULL, world_size, 
>>>>> MPI_INFO_NULL, 0, MPI_COMM_WORLD, &everyone, MPI_ERRCODES_IGNORE);
>>>>>
>>>>> The error I get is:
>>>>> Fatal error in MPI_Comm_spawn: Internal MPI error!, error stack:
>>>>> MPI_Comm_spawn(130)............: 
>>>>> MPI_Comm_spawn(cmd="./workertest", argv=(nil), maxprocs=4, 
>>>>> MPI_INFO_NULL, root=0, MPI_COMM_WORLD, intercomm=0x7fbffff6ac,
>>>>> errors=(nil)) failed
>>>>> MPIDI_Comm_spawn_multiple(172).:
>>>>> MPID_Open_port(65).............:
>>>>> MPIDI_CH3_Get_business_card(99): Internal MPI error!
>>>>>
>>>>> universe_size 1
>>>>>
>>>>> I believe this could be a machine specific error, since running the
>>>>> same code on another machine with a different implementation of MPI
>>>>> (OpenMPI) does not give any errors.
>>>>> But I am not sure what I am missing here.
>>>>>
>>>>> Thanks!
>>>>> Mrunal
>>>>>
> 

-- 
Pavan Balaji
http://www.mcs.anl.gov/~balaji


