[mpich-discuss] Using MPI_Comm_split to MPI_COMM_LOCAL

Jeff Hammond jhammond at mcs.anl.gov
Tue Nov 9 14:35:35 CST 2010


More than 64K nodes is not a problem in practice because the only
machines at that scale are Blue Gene systems, and there one need not
resort to any of this funny business with IP addresses to create
MPI_COMM_NODE communicators.

Assuming you are in a homogeneous environment and the ranks are
assigned depth-first (i.e. the ppn ranks on each node are consecutive
in MPI_COMM_WORLD), it would seem that you could form the
communicators using the following sequence of calls:

int ppn = atoi(getenv("PROCESSES_PER_NODE"));  /* assumes this environment variable is set */
MPI_Group MPI_GROUP_WORLD, MPI_GROUP_NODE;
MPI_Comm  MPI_COMM_NODE;
int size, myrank, nodes, mynode, i;

MPI_Comm_group(MPI_COMM_WORLD, &MPI_GROUP_WORLD);
MPI_Group_size(MPI_GROUP_WORLD, &size);
MPI_Group_rank(MPI_GROUP_WORLD, &myrank);
nodes  = size/ppn;    /* assumes homogeneity of ppn */
mynode = myrank/ppn;  /* integer division, i.e. floor(myrank/ppn) */
int ranks[ppn];
for (i = 0; i < ppn; i++)
    ranks[i] = mynode*ppn + i;  /* the ppn consecutive world ranks on my node */
MPI_Group_incl(MPI_GROUP_WORLD, ppn, ranks, &MPI_GROUP_NODE);
MPI_Comm_create(MPI_COMM_WORLD, MPI_GROUP_NODE, &MPI_COMM_NODE);

The implementation above is approximate and I have not verified that
it works; nor can I comment on the scalability of these calls as used
here.
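
Once MPI_COMM_NODE exists you can use it like any other communicator;
for instance, to get each process's rank within its node:

int noderank;
MPI_Comm_rank(MPI_COMM_NODE, &noderank);  /* 0 .. ppn-1 on each node */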

For a breadth-first rank assignment, the details would change but the
essence of the above implementation would not.
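
For example, here is a rough, unverified variant assuming a
round-robin (breadth-first) layout in which world rank r resides on
node r mod nodes; it reuses the variables from the sketch above:

/* breadth-first: the ranks sharing my node are mynode, mynode+nodes, ... */
mynode = myrank % nodes;
for (i = 0; i < ppn; i++)
    ranks[i] = mynode + i*nodes;
MPI_Group_incl(MPI_GROUP_WORLD, ppn, ranks, &MPI_GROUP_NODE);
MPI_Comm_create(MPI_COMM_WORLD, MPI_GROUP_NODE, &MPI_COMM_NODE);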

Jeff

On Tue, Nov 9, 2010 at 9:32 AM, Bill Rankin <Bill.Rankin at sas.com> wrote:
> Hi John,
>
> Could you just mask 16 bits out of the IP address (i.e., use a pseudo-netmask) to get a unique color? I could envision a network architecture where that would not work (nodes randomly assigned IPs out of an address space wider than 16 bits), but clusters are usually organized much more regularly than that.
>
> You run into a problem at 64k nodes in any case, but until then you are fine. :-)
>
> -b
>
>
>> -----Original Message-----
>> From: mpich-discuss-bounces at mcs.anl.gov [mailto:mpich-discuss-
>> bounces at mcs.anl.gov] On Behalf Of John Bent
>> Sent: Monday, November 08, 2010 5:47 PM
>> To: mpich-discuss
>> Subject: [mpich-discuss] Using MPI_Comm_split to MPI_COMM_LOCAL
>>
>> All,
>>
>> We'd like to create an MPI Communicator for just the processes on each
>> local node (i.e. something like MPI_COMM_LOCAL).  We were doing this
>> previously very naively by having everyone send out their hostnames and
>> then doing string parsing.  We realize that a much simpler way to do it
>> would be to use MPI_Comm_split to split MPI_COMM_WORLD by the IP
>> address.  Unfortunately, the IP address is 64 bits and the max "color"
>> to pass to MPI_Comm_split is only 2^16.  So we're currently planning on
>> splitting iteratively on each 16 bits in the 64 bit IP address.
>>
>> Anyone know a better way to achieve MPI_COMM_LOCAL?  Or can
>> MPI_Comm_split be enhanced to take a 64 bit color?
>> --
>> Thanks,
>>
>> John
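
For reference, a rough, untested sketch of the iterative splitting
described above: split repeatedly on 16-bit chunks of a 64-bit
per-node key (e.g. a packed address or a hostname hash). Here
get_host_key() is a hypothetical helper, not an MPI or system routine.

#include <stdint.h>

uint64_t key  = get_host_key();  /* hypothetical: pack address/hostname into 64 bits */
MPI_Comm comm = MPI_COMM_WORLD, next;
int worldrank;
MPI_Comm_rank(MPI_COMM_WORLD, &worldrank);
for (int shift = 48; shift >= 0; shift -= 16) {
    int color = (int)((key >> shift) & 0xFFFF);  /* always fits in 16 bits */
    MPI_Comm_split(comm, color, worldrank, &next);
    if (comm != MPI_COMM_WORLD)
        MPI_Comm_free(&comm);                    /* free intermediate communicators */
    comm = next;
}
/* comm now contains only the processes whose key matches mine in all 64 bits */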



-- 
Jeff Hammond
Argonne Leadership Computing Facility
jhammond at alcf.anl.gov / (630) 252-5381
http://www.linkedin.com/in/jeffhammond

