[mpich-discuss] New communicator from connect/accept primitives
Francisco Javier García Blas
fjblas at arcos.inf.uc3m.es
Wed Jan 20 04:39:37 CST 2010
Hello again,
Rajeev, to clarify the code, I put markers A, B, and C on each file.
Jayesh, in MPI_Intercomm_create( comm_agg, 0, pool_comm[1], 1, 12345,
&comm_aux ) the size of the peer communicator is 1, so passing 1 as the
remote leader is incorrect, right?
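Just to be sure I am reading the arguments correctly, here is that call
annotated (comm_agg and pool_comm[1] are the names from my code; the comments
only restate the standard meaning of each argument):

    MPI_Comm comm_aux;
    MPI_Intercomm_create(comm_agg,      /* local intracommunicator on this side     */
                         0,             /* local leader: rank 0 in comm_agg         */
                         pool_comm[1],  /* peer comm: the connect/accept intercomm  */
                         1,             /* remote leader: a *rank* in the peer comm */
                         12345,         /* tag used for leader-to-leader messages   */
                         &comm_aux);

Since remote_leader is a rank and the remote side of pool_comm[1] has only one
process, I suppose the only valid value there would be 0.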
I get the following error stack on serverC when the last MPI_Intercomm_create
is invoked; the rest of the processes run fine:
No matching pg foung for id = 1024812961
Fatal error in MPI_Intercomm_create: Internal MPI error!, error stack:
MPI_Intercomm_create(580).: MPI_Intercomm_create(MPI_COMM_SELF,
local_leader=0, comm=0x84000001, remote_leader=0, tag=12346,
newintercomm=0xbfb0a790) failed
MPID_GPID_ToLpidArray(382): Internal MPI error: Unknown gpid
(1289156231)0[cli_0]: aborting job:
Fatal error in MPI_Intercomm_create: Internal MPI error!, error stack:
MPI_Intercomm_create(580).: MPI_Intercomm_create(MPI_COMM_SELF,
local_leader=0, comm=0x84000001, remote_leader=0, tag=12346,
newintercomm=0xbfb0a790) failed
MPID_GPID_ToLpidArray(382): Internal MPI error: Unknown gpid (1289156231)0
rank 0 in job 9 compute-1-0_45339 caused collective abort of all ranks
exit status of rank 0: return code 1
Thanks for all your time.
Best regards
jayesh at mcs.anl.gov wrote:
> Hi,
> Rajeev, correct me if I got it wrong...
> On the client side, when creating the intercommunicator, you should pass the client_B intercommunicator (as the peer communicator) along with the client_A intracommunicator (as the local communicator): MPI_Intercomm_create( comm_agg, 0, pool_comm[1], 1, 12345, &comm_aux );
> Similarly, on the server B side, you should pass the client_B intercommunicator along with the local communicator in B: MPI_Intercomm_create( comm_world, 0, comm_inter, 0, 12345, &comm_aux );
> Let us know if it works.
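> Written out side by side (same calls as above; note that both sides must use
> the same tag, and each side passes its own local communicator plus the
> connecting intercommunicator as the peer):
>
>     /* client side */
>     MPI_Intercomm_create(comm_agg,   0, pool_comm[1], 1, 12345, &comm_aux);
>     /* server B side */
>     MPI_Intercomm_create(comm_world, 0, comm_inter,   0, 12345, &comm_aux);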
>
> Regards,
> Jayesh
> ----- Original Message -----
> From: "Francisco Javier García Blas" <fjblas at arcos.inf.uc3m.es>
> To: jayesh at mcs.anl.gov
> Cc: mpich-discuss at mcs.anl.gov
> Sent: Tuesday, January 19, 2010 10:20:46 AM GMT -06:00 US/Canada Central
> Subject: Re: [mpich-discuss] New communicator from connect/accept primitives
>
> Hi Jayesh,
>
> I have no problem with MPI_Intercomm_merge. I tried merging in both
> directions successfully. I also checked the size of the new intracommunicator
> after merging, and it is correct (size 2).
>
> Additionally, yesterday I tried the MPI_Comm_spawn + MPI_Intercomm_create
> examples from the test suite without problems. In those cases all the
> processes in the same group end up with the same intercommunicator. However,
> in my case, I am doing something wrong when the three processes call
> MPI_Intercomm_create across two remote groups (AB intra, C inter). Maybe a
> mistake in the arguments?
>
> As Dave suggested, I tried my example with the latest stable version of
> MPICH2, with similar results.
>
> Thanks for everything.
>
> Regards
>
> On 19/01/2010, at 16:22, jayesh at mcs.anl.gov wrote:
>
> Hi,
> I haven't looked at your code yet. You can look at the testcase, testconnect.c ( https://svn.mcs.anl.gov/repos/mpi/mpich2/trunk/test/mpi/manual/testconnect.c ), in the MPICH2 test suite for a simple example on how to use connect/accept and intercomm_merge to create an intracommunicator.
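>
> In outline, the connect/accept + merge pattern looks like this (a generic
> skeleton, not the testcase verbatim; the port name must reach the client out
> of band, and error handling is omitted):
>
>     char port[MPI_MAX_PORT_NAME];
>     MPI_Comm inter, intra;
>
>     /* server side */
>     MPI_Open_port(MPI_INFO_NULL, port);
>     /* ...make the port name available to the client... */
>     MPI_Comm_accept(port, MPI_INFO_NULL, 0, MPI_COMM_WORLD, &inter);
>     MPI_Intercomm_merge(inter, 0, &intra);   /* server group ordered first */
>
>     /* client side */
>     MPI_Comm_connect(port, MPI_INFO_NULL, 0, MPI_COMM_WORLD, &inter);
>     MPI_Intercomm_merge(inter, 1, &intra);   /* client group ordered last  */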
>
> -Jayesh
>
> ----- Original Message -----
> From: "Francisco Javier García Blas" < fjblas at arcos.inf.uc3m.es >
> To: mpich-discuss at mcs.anl.gov
> Sent: Monday, January 18, 2010 10:26:08 AM GMT -06:00 US/Canada Central
> Subject: Re: [mpich-discuss] New communicator from connect/accept primitives
>
> Hello again,
>
> In the first place, thanks to Rajeev and Jayesh for their responses. Following
> Rajeev's instructions, I implemented a basic example using
> connect/accept and intercomm_create/merge primitives. I am doing
> something wrong, because when MPI_Intercomm_create is invoked all the
> processes block. I cannot find the error; maybe it is a bad rank numbering
> for the local and remote communicators, but I have tried all the
> combinations.
>
> I am using mpich2 1.0.5.
>
> I attach the source code and a makefile.
>
> Best regards
>
> Rajeev Thakur wrote:
>
> You will need to use intercomm_merge but you have to merge them one
> pair at a time. Example below from an old mail.
>
> Rajeev
>
> If you have 3 intercommunicators AB_inter, AC_inter, and AD_inter, you
> can merge them all into a single intercommunicator as follows:
>
> * Begin by doing an MPI_Intercomm_merge on AB_inter, resulting in an
>   intracommunicator AB_intra.
>
> * Then create an intercommunicator between AB on one side and C on the other
>   by using MPI_Intercomm_create. Pass AB_intra as the local_comm on A and B,
>   MPI_COMM_WORLD as the intracomm on C, and AC_inter as the peer_comm. This
>   results in the intercommunicator AB_C_inter.
>
> * Then call MPI_Intercomm_merge on it to create the intracommunicator
>   ABC_intra.
>
> * Then call MPI_Intercomm_create to create an intercommunicator between ABC
>   and D just as you did with AB and C above.
>
> * Again do an intercomm_merge. This will give you an intracommunicator
>   containing A, B, C, D.
>
> * If you want an intercommunicator with A in one group and B, C, D in the
>   other, as you would get with a single spawn of 3 processes, you have to
>   call MPI_Comm_split to split this single communicator into two
>   intracommunicators, one containing A and the other containing B, C, D.
>   Then call MPI_Intercomm_create to create the intercommunicator.
>
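> In code, a minimal sketch of that sequence (as seen from A and B) might look
> like this -- the names are illustrative, error handling is omitted, and the
> leader ranks assume the process holding AC_inter/AD_inter ends up as rank 0
> after each merge:
>
>     MPI_Comm AB_intra, AB_C_inter, ABC_intra, ABC_D_inter, ABCD_intra;
>
>     /* 1. Merge the A-B intercommunicator into an intracommunicator. */
>     MPI_Intercomm_merge(AB_inter, 0, &AB_intra);
>
>     /* 2. Build an intercommunicator between {A,B} and C: AB_intra is the
>        local comm, AC_inter the peer comm (significant only at the local
>        leader), and the tag must match on both sides.                    */
>     MPI_Intercomm_create(AB_intra, 0, AC_inter, 0, 101, &AB_C_inter);
>     /* On C the matching call is:
>        MPI_Intercomm_create(MPI_COMM_WORLD, 0, AC_inter, 0, 101, &AB_C_inter); */
>
>     /* 3. Merge again to get an intracommunicator over A, B, C. */
>     MPI_Intercomm_merge(AB_C_inter, 0, &ABC_intra);
>
>     /* 4./5. Repeat the same two steps with D, using AD_inter as peer comm. */
>     MPI_Intercomm_create(ABC_intra, 0, AD_inter, 0, 102, &ABC_D_inter);
>     MPI_Intercomm_merge(ABC_D_inter, 0, &ABCD_intra);
>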
> ------------------------------------------------------------------------
>
> From: mpich-discuss-bounces at mcs.anl.gov
> [mailto:mpich-discuss-bounces at mcs.anl.gov] On Behalf Of Francisco Javier García Blas
> Sent: Friday, January 15, 2010 11:09 AM
> To: mpich-discuss at mcs.anl.gov
> Subject: [mpich-discuss] New communicator from connect/accept primitives
>
> Hello all,
>
> I am wondering about the possibility of getting a new inter-communicator
> from N communicators, which are the results of different calls to
> mpi_comm_connect or mpi_comm_accept.
>
> My initial solution was, first, to get the group of each inter-communicator
> with mpi_comm_group; second, to join all the groups into one bigger group;
> and finally, to create a new communicator from that group with the
> mpi_comm_create primitive.
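>
> For two such inter-communicators the idea would be roughly the following
> (names are only illustrative):
>
>     MPI_Group g_local, g_rem1, g_rem2, g_tmp, g_all;
>     MPI_Comm_group(inter1, &g_local);        /* my local group             */
>     MPI_Comm_remote_group(inter1, &g_rem1);  /* remote side of intercomm 1 */
>     MPI_Comm_remote_group(inter2, &g_rem2);  /* remote side of intercomm 2 */
>     MPI_Group_union(g_local, g_rem1, &g_tmp);
>     MPI_Group_union(g_tmp, g_rem2, &g_all);
>     /* MPI_Comm_create(?, g_all, &newcomm) -- but which existing communicator
>        already contains all of these processes?                             */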
>
> Currently I am handling a pool of inter-communicators in order to keep the
> functionality. However, this approach is not suitable for collectives and
> MPI_ANY_SOURCE sends/recvs.
>
> Is there another way to join all the inter-communicators into one?
>
> Any suggestion?
>
> Best regards.
>
> --------------------------------------------------
> Francisco Javier García Blas
> Computer Architecture, Communications and Systems Area.
> Computer Science Department. UNIVERSIDAD CARLOS III DE MADRID
> Avda. de la Universidad, 30
> 28911 Leganés (Madrid), SPAIN
> e-mail: fjblas at arcos.inf.uc3m.es
>         fjblas at inf.uc3m.es
> Phone: (+34) 916249118
> FAX: (+34) 916249129
> --------------------------------------------------
>
> ------------------------------------------------------------------------
>
> _______________________________________________
> mpich-discuss mailing list
> mpich-discuss at mcs.anl.gov
> https://lists.mcs.anl.gov/mailman/listinfo/mpich-discuss
>
> _______________________________________________
> mpich-discuss mailing list
> mpich-discuss at mcs.anl.gov
> https://lists.mcs.anl.gov/mailman/listinfo/mpich-discuss
>
> --------------------------------------------------
> Francisco Javier García Blas
> Computer Architecture, Communications and Systems Area.
> Computer Science Department. UNIVERSIDAD CARLOS III DE MADRID
> Avda. de la Universidad, 30
> 28911 Leganés (Madrid), SPAIN
> e-mail: fjblas at arcos.inf.uc3m.es
> fjblas at inf.uc3m.es
> Phone:(+34) 916249118
> FAX: (+34) 916249129
> --------------------------------------------------
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: test_inter.tgz
Type: application/octet-stream
Size: 1410 bytes
Desc: not available
URL: <http://lists.mcs.anl.gov/pipermail/mpich-discuss/attachments/20100120/05c0bb88/attachment.obj>