[mpich-discuss] communicator creation/deletion semantics and performance
Jim Dinan
dinan at mcs.anl.gov
Tue Oct 30 13:14:59 CDT 2012
The paper that Jeff cited presents an implementation of
MPI_Comm_create_group that is built on top of MPI-2 and uses recursive
intercommunicator merging:
https://www.mcs.anl.gov/publications/paper_detail.php?id=1695
At this year's EuroMPI, we presented a paper on the native MPI-3
implementation of MPI_Comm_create_group and compared its cost with
MPI_Comm_create and the technique presented in the above paper at
EuroMPI '11. The general takeaway is that MPI_Comm_create_group should
always be the cheaper option:
https://www.mcs.anl.gov/publications/paper_detail.php?id=2061
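
As a minimal sketch of the create-on-demand pattern (not code from either
paper), the C fragment below builds a communicator for a union of dimensions
with MPI_Comm_create_group, which is collective only over the members of the
new group. The helper names union_ranks and comm_from_union, the HAVE_MPI
guard macro, and the row-major mapping of ranks to coordinates are all
assumptions made for illustration; the real mapping on BG/Q would have to come
from the system.

```c
#include <assert.h>
#include <stdlib.h>
#ifdef HAVE_MPI   /* hypothetical guard: define it when building with mpicc */
#include <mpi.h>
#endif

/* Hypothetical helper: list every rank that differs from `rank` only in the
 * dimensions selected by `mask` (bit d set = dimension d joins the union).
 * ASSUMPTION: ranks are laid out row-major over dims[0..ndims-1], with the
 * last dimension varying fastest.
 * Returns a malloc'd array in ascending rank order (caller frees it) and
 * stores its length in *count. Scans all ranks, so the cost is O(total). */
int *union_ranks(int rank, int ndims, const int dims[], unsigned mask, int *count)
{
    assert(ndims >= 1 && ndims <= 31);

    int total = 1, n = 1;
    for (int d = 0; d < ndims; d++) {
        total *= dims[d];
        if (mask & (1u << d))
            n *= dims[d];
    }
    assert(rank >= 0 && rank < total);

    int *ranks = malloc((size_t)n * sizeof *ranks);
    if (ranks == NULL)
        return NULL;

    int k = 0;
    for (int r = 0; r < total; r++) {
        int a = r, b = rank, same = 1;
        /* Peel coordinates off from the fastest-varying dimension and
         * require equality on every dimension that is NOT in the union. */
        for (int d = ndims - 1; d >= 0; d--) {
            if (!(mask & (1u << d)) && a % dims[d] != b % dims[d]) {
                same = 0;
                break;
            }
            a /= dims[d];
            b /= dims[d];
        }
        if (same)
            ranks[k++] = r;
    }
    *count = k;
    return ranks;
}

#ifdef HAVE_MPI
/* Create a communicator over the union containing the calling process.
 * Only the processes in that union need to make this call. */
int comm_from_union(MPI_Comm parent, int ndims, const int dims[],
                    unsigned mask, int tag, MPI_Comm *newcomm)
{
    int rank, count;
    MPI_Comm_rank(parent, &rank);

    int *ranks = union_ranks(rank, ndims, dims, mask, &count);
    if (ranks == NULL)
        return MPI_ERR_NO_MEM;

    MPI_Group parent_group, sub_group;
    MPI_Comm_group(parent, &parent_group);
    MPI_Group_incl(parent_group, count, ranks, &sub_group);

    int err = MPI_Comm_create_group(parent, sub_group, tag, newcomm);

    MPI_Group_free(&sub_group);
    MPI_Group_free(&parent_group);
    free(ranks);
    return err;
}
#endif
```

Each member of a union would call comm_from_union with the same dims, mask,
and tag, use the resulting communicator for the iteration, and then release it
with MPI_Comm_free.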
~Jim.
On 10/30/12 12:48 PM, Jeff Hammond wrote:
> If you want to dynamically generate communicators on a group without
> it being collective on the parent communicator (which might be
> MPI_COMM_WORLD in the worst case), see
> http://www.mcs.anl.gov/publications/paper_detail.php?id=1695. I
> suspect this will be useful for doing the dynamic unions you describe
> below. This operation is part of MPI-3 (see MPI_Comm_create_group)
> and was implemented in MPICH2 a while ago. I don't know whether the
> optimized (i.e. internal) implementation works on BGQ, but I suspect it can be
> done relatively easily in the event that the on-top-of-MPI
> implementation isn't fast enough (it is quite fast - faster than
> MPI_Comm_create - for small groups, at least on BGP (see paper for
> details)).
>
> I believe that MPI_Comm_split does communication, potentially
> expensive communication (allgather? or maybe that was the optimized
> version...). At some point, I recall Bill Gropp and coworkers doing
> some work to optimize it because of a problem we observed at scale on
> BGP. I don't know the status of that and whether or not it is part of
> MPICH2 yet. As for MPI_Comm_free, I suspect it is quite cheap and
> does at most a barrier, but I am speculating.
>
> I can hack an optimized version of MPI_Comm_create_group into BGQ-MPI
> using PAMI if IBM doesn't do it first.
> PAMI_Geometry_create_endpointlist is collective only over the output
> geometry, which matches the semantics of MPI_Comm_create_group.
>
> Is this at all helpful?
>
> Jeff
>
> On Tue, Oct 30, 2012 at 11:33 AM, Edgar Solomonik
> <solomon at eecs.berkeley.edu> wrote:
>> Hello,
>>
>> For my application, I need to maintain or dynamically create a large number
>> of communicators. My current solution has been to initialize a large number
>> of communicators at start-up and make dynamic decisions on which to use
>> later. I have run into MPI errors due to creating too many communicators on
>> some occasions, but have so far been able to resolve this by limiting the
>> set.
>>
>> However, I am now interested in employing an even larger set of
>> communicators that is harder to generate completely. So, I would like to
>> move to an approach which dynamically creates and frees communicators on
>> demand. I am concerned about two issues:
>>
>> 1. Is there an overhead to MPI_Comm_split and MPI_Comm_free? For instance, do
>> they need to perform inter-process communication?
>> 2. Does the limit on the number of communicators bound the number of
>> communicators ever created or the number of live (non-freed) communicators?
>>
>> My specific use-case is merging sets of communicators in dynamic ways, e.g.,
>> on BG/Q I form up 6 communicators for each dimension (+1 for intra-node) and
>> then make dynamic mapping decisions which select unions of the communicators
>> to map to. So, I either need to construct a fairly complicated tree
>> data-structure to keep up with all possible unions of communicators or I can
>> simply create the unions on demand and free them once I am done using them
>> after a given iteration. So far, I have used only contiguous unions of
>> communicators, which is a smaller set and easier to keep track of in a flat
>> data-structure, but I want even more generality now.
>>
>> Thanks,
>> Edgar
>>