[mpich-discuss] best/fastest way to get a node communicator
Jeff Hammond
jhammond at alcf.anl.gov
Tue Jan 17 16:23:54 CST 2012
Thanks for the helpful suggestions, Rhys.
A perfect hash function was the first thing that came to mind.
Unfortunately, I do not know how to write one for
MPI_MAX_PROCESSOR_NAME-length character arrays, and on clusters, which
almost always have fewer than 20000 cores and more than 20 GB of DRAM,
the memory saved isn't worth my learning how to use gperf. The memory
required for the Gather is an issue on a Blue Gene or a Cray, but as I
said, I have another solution there.
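For concreteness, here is a rough sketch of the root-side step of the
gather+qsort approach: given the gathered names in rank order, sort them
and hand out one color per distinct name. NAME_LEN, struct entry, by_name
and assign_colors are hypothetical names chosen for illustration, and
NAME_LEN is only a stand-in for MPI_MAX_PROCESSOR_NAME, whose real value
is implementation-defined. The MPI_Gather, MPI_Scatter and MPI_Comm_split
calls around it are left out so that the fragment compiles without MPI.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define NAME_LEN 256 /* assumption: stand-in for MPI_MAX_PROCESSOR_NAME */

/* One gathered name together with the rank it came from. */
struct entry {
    char name[NAME_LEN];
    int rank;
};

/* qsort comparator: order entries by processor name. */
static int by_name(const void *a, const void *b)
{
    return strcmp(((const struct entry *)a)->name,
                  ((const struct entry *)b)->name);
}

/* names:  n gathered names, NAME_LEN bytes each, in rank order.
 * colors: output, colors[r] is the color assigned to rank r.
 * Returns the number of distinct names, or -1 if allocation fails. */
static int assign_colors(const char *names, int n, int *colors)
{
    struct entry *e;
    int i, color = 0;

    assert(names != NULL && colors != NULL && n > 0);

    e = malloc((size_t)n * sizeof *e);
    if (e == NULL)
        return -1;

    for (i = 0; i < n; i++) {
        memcpy(e[i].name, names + (size_t)i * NAME_LEN, NAME_LEN);
        e[i].name[NAME_LEN - 1] = '\0'; /* guard against unterminated input */
        e[i].rank = i;
    }

    qsort(e, (size_t)n, sizeof *e, by_name);

    /* Equal names are now adjacent; bump the color at each change. */
    for (i = 0; i < n; i++) {
        if (i > 0 && strcmp(e[i].name, e[i - 1].name) != 0)
            color++;
        colors[e[i].rank] = color;
    }

    free(e);
    return color + 1;
}
```

With these assumed sizes, 20000 ranks times 256 bytes is about 5 MB for
the gathered names on root, plus a similar amount for the sort copy.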
Whenever MPI_Get_processor_name returns an IP address, I don't need a
hash function because I can just convert the IP address into an
integer (32-bit for IPv4 and 128-bit for IPv6) and use that as the
key. A straightforward optimization is to test whether every rank got
back an IP address; if so, use it as the key, and otherwise fall back to
the gather+qsort implementation.
Jeff
On Tue, Jan 17, 2012 at 3:39 PM, Rhys Ulerich <rhys.ulerich at gmail.com> wrote:
>> On 01/17/2012 01:24 PM, Jeff Hammond wrote:
>>> I'm interested in being able to create a communicator for each node.
>>> I have custom APIs for this on Blue Gene and Cray, but that doesn't
>>> help on clusters.
>>>
>>> I was thinking of doing the following:
>>> - call gethostname on each rank
>>> - gather these values to root
>>> - sort the array and assign a different color number for each unique
>>> value in a new array
>>> - scatter the color array and call comm_split
>>>
>>> Does anyone know of a better/faster way to do this?
>
> This may avoid the memory overhead on root and avoids the sort.
>
> - call MPI_Get_processor_name as Jim Dinan suggested
> - hash the processor name into a nonnegative int color (with care as
> to the chosen hash function)
> - MPI_Comm_split on the color
> - Set value '1' into some integer buffer on all ranks.
> - MPI_Allreduce MPI_SUM on each integer buffer to find the number of
> ranks in each rank's color-communicator
> - MPI_Allreduce to find the minimum and maximum of the ranks in each
> color-communicator across COMM_WORLD
> - If minimum and maximum are not identical, throw away the bad
> node-specific communicators (you encountered a hash collision), add
> some salt to the hash (maybe an iteration number), and repeat the
> process.
>
> Presumably you won't be hit by hash collisions indefinitely. With a
> good hash function, you probably won't hit any collisions at all.
>
> It is likely not faster (as it is a bit chatty) but I've never
> measured it. Nor implemented it.
>
> - Rhys
--
Jeff Hammond
Argonne Leadership Computing Facility
University of Chicago Computation Institute
jhammond at alcf.anl.gov / (630) 252-5381
http://www.linkedin.com/in/jeffhammond
https://wiki.alcf.anl.gov/old/index.php/User:Jhammond