[mpich-discuss] New communicator from connect/accept primitives

Francisco Javier García Blas fjblas at arcos.inf.uc3m.es
Thu Jan 21 05:25:11 CST 2010


Hi Jayesh,

The trick consisted of connecting both servers with another intercommunicator; I thought that was not necessary when using MPI_Intercomm_create.
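
For reference, a minimal sketch of what connecting the two servers with an extra intercommunicator can look like; the port-exchange step and the function names below are assumptions of this sketch, not taken from Jayesh's modified code:

    #include <mpi.h>

    /* On server B: open a port and accept a connection from server C.
       How the port name reaches server C (a file, the command line, or
       forwarded through the client) is left out here. */
    void server_b_accept(MPI_Comm *BC_inter)
    {
        char port[MPI_MAX_PORT_NAME];
        MPI_Open_port(MPI_INFO_NULL, port);
        /* ... make 'port' visible to server C ... */
        MPI_Comm_accept(port, MPI_INFO_NULL, 0, MPI_COMM_WORLD, BC_inter);
        MPI_Close_port(port);
    }

    /* On server C: connect to the port published by server B. */
    void server_c_connect(char *port, MPI_Comm *BC_inter)
    {
        MPI_Comm_connect(port, MPI_INFO_NULL, 0, MPI_COMM_WORLD, BC_inter);
    }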

This example represents a good scenario for using intercommunicators with isolated servers. It could be interesting to include it in the mpich2 examples :D.

Thanks again Rajeev and Jayesh for your helpful tips and time.

Best regards

On 20/01/2010, at 18:34, Jayesh Krishna wrote:

> Hi,
> I have modified your code (You can download the same at http://www.mcs.anl.gov/research/projects/mpich2/downloads/tarballs/nightly/temp/test_inter_modif.tar.gz) and it works for me (The code is now closer to Rajeev's suggestions).
> Let me know if it works for you.
> 
> Regards,
> Jayesh
> ----- Original Message -----
> From: "Rajeev Thakur" <thakur at mcs.anl.gov>
> To: mpich-discuss at mcs.anl.gov
> Sent: Wednesday, January 20, 2010 6:47:14 AM GMT -06:00 US/Canada Central
> Subject: Re: [mpich-discuss] New communicator from connect/accept primitives
> 
> In my algorithm server A was connected to 2 clients B and C. Since you have one client connected to 2 servers, I suggested you call
> the client A and the servers B and C and follow the same algorithm. A is the common point that has connections to both B and C,
> hence it is important to follow the algorithm as provided. Also, in one of your files I saw MPI_COMM_NULL as a communicator to
> MPI_Intercomm_create. Although I haven't studied the code in detail, I don't think you can pass COMM_NULL. Use COMM_WORLD as in my
> algorithm.
> 
> Rajeev
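
A small sketch of that last point, as Rajeev notes it, with assumed variable names: the local communicator passed to MPI_Intercomm_create must be a valid intracommunicator on every calling process, not MPI_COMM_NULL.

    #include <mpi.h>

    void build_intercomm(MPI_Comm peer_inter, MPI_Comm *new_inter)
    {
        /* Not like this: MPI_COMM_NULL is not a usable local communicator.
           MPI_Intercomm_create(MPI_COMM_NULL, 0, peer_inter, 0, 12345, new_inter); */

        /* Pass a real intracommunicator instead, e.g. MPI_COMM_WORLD. */
        MPI_Intercomm_create(MPI_COMM_WORLD, 0, peer_inter, 0, 12345, new_inter);
    }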
> 
> 
>> -----Original Message-----
>> From: mpich-discuss-bounces at mcs.anl.gov 
>> [mailto:mpich-discuss-bounces at mcs.anl.gov] On Behalf Of 
>> Francisco Javier García Blas
>> Sent: Wednesday, January 20, 2010 4:40 AM
>> To: mpich-discuss at mcs.anl.gov
>> Subject: Re: [mpich-discuss] New communicator from 
>> connect/accept primitives
>> 
>> Hello again,
>> 
>> Rajeev, to clarify the code, I put labels A, B, and C on each file.
>> 
>> Jayesh, in MPI_Intercomm_create( comm_agg, 0, pool_comm[1], 1, 12345,
>> &comm_aux ) the size of the peer communicator is 1; therefore, passing 1
>> as the remote leader is incorrect, right?
>> 
>> I got the following error stack on server C when the last
>> MPI_Intercomm_create is invoked. The rest of the processes run fine:
>> 
>> No matching pg foung for id = 1024812961 Fatal error in 
>> MPI_Intercomm_create: Internal MPI error!, error stack:
>> MPI_Intercomm_create(580).: MPI_Intercomm_create(MPI_COMM_SELF,
>> local_leader=0, comm=0x84000001, remote_leader=0, tag=12346,
>> newintercomm=0xbfb0a790) failed
>> MPID_GPID_ToLpidArray(382): Internal MPI error: Unknown gpid
>> (1289156231)0[cli_0]: aborting job:
>> Fatal error in MPI_Intercomm_create: Internal MPI error!, error stack:
>> MPI_Intercomm_create(580).: MPI_Intercomm_create(MPI_COMM_SELF,
>> local_leader=0, comm=0x84000001, remote_leader=0, tag=12346,
>> newintercomm=0xbfb0a790) failed
>> MPID_GPID_ToLpidArray(382): Internal MPI error: Unknown gpid 
>> (1289156231)0
>> rank 0 in job 9  compute-1-0_45339   caused collective abort 
>> of all ranks
>>  exit status of rank 0: return code 1
>> 
>> Thanks for all your time.
>> 
>> Best regards
>> 
>> jayesh at mcs.anl.gov wrote:
>>> Hi,
>>> Rajeev, correct me if I got it wrong...
>>> On the client side, when creating the intercommunicator, you should
>>> pass the client_B intercommunicator together with the client_A
>>> intracommunicator (MPI_Intercomm_create( comm_agg, 0, pool_comm[1], 1,
>>> 12345, &comm_aux ); ).
>>> Similarly, on the server B side you should pass the client_B
>>> intercommunicator together with the local communicator in B
>>> (MPI_Intercomm_create( comm_world, 0, comm_inter, 0, 12345, &comm_aux ); ).
>>> Let us know if it works.
>>> 
>>> Regards,
>>> Jayesh
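
A sketch of the two matching calls described above, with the argument roles spelled out; the variable names come from the fragments quoted here, and the comments are generic MPI_Intercomm_create semantics rather than anything specific to this thread:

    #include <mpi.h>

    /* Client side: comm_agg is the client's intracommunicator, pool_comm[1]
       the intercommunicator to server B obtained from MPI_Comm_connect. */
    void client_side(MPI_Comm comm_agg, MPI_Comm *pool_comm, MPI_Comm *comm_aux)
    {
        MPI_Intercomm_create(comm_agg,     /* local_comm: local intracommunicator     */
                             0,            /* local_leader: its rank in comm_agg       */
                             pool_comm[1], /* peer_comm: path between the two leaders  */
                             1,            /* remote_leader: that leader's rank there  */
                             12345,        /* tag: must match the other side           */
                             comm_aux);
    }

    /* Server B side: comm_world is B's local communicator, comm_inter the
       intercommunicator to the client obtained from MPI_Comm_accept. */
    void server_b_side(MPI_Comm comm_world, MPI_Comm comm_inter, MPI_Comm *comm_aux)
    {
        MPI_Intercomm_create(comm_world, 0, comm_inter, 0, 12345, comm_aux);
    }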
>>> ----- Original Message -----
>>> From: "Francisco Javier García Blas" <fjblas at arcos.inf.uc3m.es>
>>> To: jayesh at mcs.anl.gov
>>> Cc: mpich-discuss at mcs.anl.gov
>>> Sent: Tuesday, January 19, 2010 10:20:46 AM GMT -06:00 US/Canada 
>>> Central
>>> Subject: Re: [mpich-discuss] New communicator from connect/accept 
>>> primitives
>>> 
>>> Hi Jayesh,
>>> 
>>> 
>>> I have no problem with MPI_Intercomm_merge. I tried merging in both
>>> directions successfully. I also checked the size of the new
>>> intracommunicator after merging, and it is correct too (size 2).
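
A small sketch of that check, with assumed names: merge the connect/accept intercommunicator and verify that the resulting intracommunicator spans both sides.

    #include <mpi.h>
    #include <stdio.h>

    void check_merge(MPI_Comm inter)
    {
        MPI_Comm intra;
        int size;

        /* The 'high' flag (0 or 1) only selects the rank ordering. */
        MPI_Intercomm_merge(inter, 0, &intra);
        MPI_Comm_size(intra, &size);
        printf("merged intracommunicator size = %d\n", size); /* 2 in this test */
        MPI_Comm_free(&intra);
    }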
>>> 
>>> 
>>> Additionally, yesterday I tried the MPI_Comm_spawn + MPI_Intercomm_create
>>> examples from the test suite without problems. In those cases all the
>>> processes in the same group have the same intercommunicators. However, in
>>> my case, I am doing something wrong when three processes call
>>> MPI_Intercomm_create over two remote groups (AB intra, C inter). A mistake
>>> in the arguments, maybe?
>>> 
>>> 
>>> As Dave suggested, I tried my example with the latest stable version of
>>> MPICH2, with similar results.
>>> 
>>> 
>>> Thanks for all
>>> 
>>> 
>>> Regards
>>> 
>>> 
>>> 
>>> On 19/01/2010, at 16:22, jayesh at mcs.anl.gov wrote:
>>> 
>>> 
>>> 
>>> Hi,
>>> I haven't looked at your code yet. You can look at the testcase,
>>> testconnect.c (
>>> https://svn.mcs.anl.gov/repos/mpi/mpich2/trunk/test/mpi/manual/testconnect.c
>>> ), in the MPICH2 test suite for a simple example on how to use
>>> connect/accept and intercomm_merge to create an intracommunicator.
>>> 
>>> -Jayesh
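
Not a copy of testconnect.c, just a minimal sketch of the same connect/accept plus MPI_Intercomm_merge pattern it demonstrates; here the port name is printed by the server and passed to the client on the command line, which is an assumption of this sketch:

    #include <mpi.h>
    #include <stdio.h>
    #include <string.h>

    int main(int argc, char **argv)
    {
        MPI_Comm inter, intra;
        char port[MPI_MAX_PORT_NAME];
        int size;

        MPI_Init(&argc, &argv);

        if (argc > 1) {
            /* Client: the server's port name is given on the command line. */
            strncpy(port, argv[1], MPI_MAX_PORT_NAME - 1);
            port[MPI_MAX_PORT_NAME - 1] = '\0';
            MPI_Comm_connect(port, MPI_INFO_NULL, 0, MPI_COMM_WORLD, &inter);
        } else {
            /* Server: open a port, print it, and wait for the client. */
            MPI_Open_port(MPI_INFO_NULL, port);
            printf("port: %s\n", port);
            fflush(stdout);
            MPI_Comm_accept(port, MPI_INFO_NULL, 0, MPI_COMM_WORLD, &inter);
            MPI_Close_port(port);
        }

        /* Merge the intercommunicator into one intracommunicator. */
        MPI_Intercomm_merge(inter, argc > 1 ? 1 : 0, &intra);
        MPI_Comm_size(intra, &size);
        printf("intracommunicator size = %d\n", size);

        MPI_Comm_free(&intra);
        MPI_Comm_disconnect(&inter);
        MPI_Finalize();
        return 0;
    }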
>>> 
>>> ----- Original Message -----
>>> From: "Francisco Javier García Blas" < fjblas at arcos.inf.uc3m.es >
>>> To: mpich-discuss at mcs.anl.gov
>>> Sent: Monday, January 18, 2010 10:26:08 AM GMT -06:00 US/Canada 
>>> Central
>>> Subject: Re: [mpich-discuss] New communicator from connect/accept 
>>> primitives
>>> 
>>> Hello again,
>>> 
>>> First of all, thanks to Rajeev and Jayesh for their responses. Following
>>> Rajeev's instructions, I implemented a basic example using connect/accept
>>> and intercomm_create/merge primitives. I am doing something wrong, because
>>> when MPI_Intercomm_create is invoked all the processes block. I cannot
>>> find the error; maybe it is a bad numbering in the local and remote
>>> communicators, but I tried all the combinations.
>>> 
>>> I am using mpich2 1.0.5. 
>>> 
>>> I attach the source code and a makefile. 
>>> 
>>> Best regards
>>> 
>>> Rajeev Thakur wrote:
>>> 
>>> 
>>> You will need to use intercomm_merge, but you have to merge them one
>>> pair at a time. Example below from an old mail.
>>>
>>> Rajeev
>>>
>>> If you have 3 intercommunicators AB_inter, AC_inter, and AD_inter, you
>>> can merge them all into a single intercommunicator as follows:
>>>
>>> * Begin by doing an MPI_Intercomm_merge on AB_inter, resulting in an
>>> intracommunicator AB_intra.
>>>
>>> * Then create an intercommunicator between AB on one side and C on the
>>> other by using MPI_Intercomm_create. Pass AB_intra as the local_comm on
>>> A and B, MPI_COMM_WORLD as the intracomm on C, and AC_inter as the
>>> peer_comm. This results in the intercommunicator AB_C_inter.
>>>
>>> * Then call MPI_Intercomm_merge on it to create the intracommunicator
>>> ABC_intra.
>>>
>>> * Then call MPI_Intercomm_create to create an intercommunicator between
>>> ABC and D, just as you did with AB and C above.
>>>
>>> * Again do an intercomm_merge. This will give you an intracommunicator
>>> containing A, B, C, D.
>>>
>>> * If you want an intercommunicator with A in one group and B, C, D in
>>> the other, as you would get with a single spawn of 3 processes, you
>>> have to call MPI_Comm_split to split this single communicator into two
>>> intracommunicators, one containing A and the other containing B, C, D.
>>> Then call MPI_Intercomm_create to create the intercommunicator.
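
A minimal sketch of this pairwise-merge recipe in code, for the three-process case discussed in this thread; the role numbering, variable names, and tag below are assumptions of the sketch, each role is assumed to be a single process, and A is assumed to hold both intercommunicators obtained from connect/accept.

    #include <mpi.h>

    /* Build an intracommunicator over A, B and C from two intercommunicators.
       Roles: 0 = A, 1 = B, 2 = C.  AB_inter is valid on A and B; AC_inter is
       valid on A and C (pass MPI_COMM_NULL where a handle is not available). */
    MPI_Comm merge_abc(int role, MPI_Comm AB_inter, MPI_Comm AC_inter)
    {
        MPI_Comm AB_intra = MPI_COMM_NULL, AB_C_inter, ABC_intra;

        if (role == 0 || role == 1) {
            /* Step 1 (A and B): merge the A-B intercommunicator.  A passes
               high=0 and B passes high=1, so A becomes rank 0 in AB_intra. */
            MPI_Intercomm_merge(AB_inter, (role == 1), &AB_intra);

            /* Step 2 (A and B): intercommunicator between {A,B} and C.
               A (rank 0 in AB_intra) is the local leader and reaches C's
               leader (remote rank 0) through AC_inter; peer_comm is
               significant only at the local leader, so B's value is ignored. */
            MPI_Intercomm_create(AB_intra, 0, AC_inter, 0, 12345, &AB_C_inter);
        } else {
            /* Step 2 (C): local_comm is C's own MPI_COMM_WORLD, the peer
               communicator is AC_inter with A's leader at remote rank 0. */
            MPI_Intercomm_create(MPI_COMM_WORLD, 0, AC_inter, 0, 12345,
                                 &AB_C_inter);
        }

        /* Step 3 (everyone): merge once more into a single intracommunicator. */
        MPI_Intercomm_merge(AB_C_inter, 0, &ABC_intra);

        MPI_Comm_free(&AB_C_inter);
        if (AB_intra != MPI_COMM_NULL)
            MPI_Comm_free(&AB_intra);
        return ABC_intra;
    }

If the final goal is A in one group and B, C in the other, as with a single spawn, the MPI_Comm_split plus MPI_Intercomm_create step from the last bullet above would follow on the returned intracommunicator.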
>>>
>>> ------------------------------------------------------------------------
>>> *From:* mpich-discuss-bounces at mcs.anl.gov
>>> [mailto:mpich-discuss-bounces at mcs.anl.gov] *On Behalf Of* Francisco Javier García Blas
>>> *Sent:* Friday, January 15, 2010 11:09 AM
>>> *To:* mpich-discuss at mcs.anl.gov
>>> *Subject:* [mpich-discuss] New communicator from connect/accept primitives
>>>
>>> Hello all,
>>>
>>> I am wondering about the possibility of getting a new inter-communicator
>>> from N communicators, which are the results of different calls to
>>> mpi_comm_connect or mpi_comm_accept.
>>>
>>> My initial solution was, first, to get the group of each
>>> inter-communicator with mpi_comm_group; second, to join all the groups
>>> into one bigger group; and finally, to create a new communicator from
>>> that group with the mpi_comm_create primitive.
>>>
>>> Currently I am handling a pool of inter-communicators in order to keep
>>> the functionality. However, this idea is not suitable for collectives
>>> and MPI_ANY_SOURCE sends/recvs.
>>>
>>> Is there another way to join all the inter-communicators into one?
>>>
>>> Any suggestions?
>>>
>>> Best regards.
>>>
>>> --------------------------------------------------
>>> Francisco Javier García Blas
>>> Computer Architecture, Communications and Systems Area.
>>> Computer Science Department. UNIVERSIDAD CARLOS III DE MADRID
>>> Avda. de la Universidad, 30
>>> 28911 Leganés (Madrid), SPAIN
>>> e-mail: fjblas at arcos.inf.uc3m.es
>>> fjblas at inf.uc3m.es
>>> Phone: (+34) 916249118
>>> FAX: (+34) 916249129
>>> --------------------------------------------------
>>>
>>> _______________________________________________
>>> mpich-discuss mailing list
>>> mpich-discuss at mcs.anl.gov
>>> https://lists.mcs.anl.gov/mailman/listinfo/mpich-discuss
>> 
>> 
> 
> _______________________________________________
> mpich-discuss mailing list
> mpich-discuss at mcs.anl.gov
> https://lists.mcs.anl.gov/mailman/listinfo/mpich-discuss

--------------------------------------------------
Francisco Javier García Blas
Computer Architecture, Communications and Systems Area.
Computer Science Department. UNIVERSIDAD CARLOS III DE MADRID
Avda. de la Universidad, 30
28911 Leganés (Madrid), SPAIN
e-mail: fjblas at arcos.inf.uc3m.es
              fjblas at inf.uc3m.es
Phone:(+34) 916249118
FAX: (+34) 916249129
--------------------------------------------------
