[mpich-discuss] MPI_Comm_spawn(), dynamic distribution and hydra

Roberto Fichera kernel at tekno-soft.it
Thu Mar 3 02:47:22 CST 2011


On 03/02/2011 09:28 PM, Pavan Balaji wrote:

Ciao Pavan,

> Hi Roberto,
>
> Hydra doesn't currently have the capability to request more resources from a resource manager. It can only query
> for the allocated resources.

So this means that, for now, I have to use the resource manager API directly, right?

What about performing the MPI boot process on freshly allocated nodes? Do you have
any suggestions on how to do it?
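
For reference, querying what is already allocated looks simple enough from the
application side. A minimal sketch, assuming the PBS node list (the file named
by $PBS_NODEFILE) contains one hostname per line, repeated once per slot. The
names count_host_slots() and struct host_slots are hypothetical, they are not
part of libpbs, Hydra or MPICH2:

```c
#include <assert.h>
#include <string.h>

#define MAX_NAME 256   /* assumed upper bound on a hostname, including the NUL */

/* Hypothetical table entry: one unique host and how many slots it offers. */
struct host_slots {
    char name[MAX_NAME];
    int  slots;
};

/*
 * Parse nodefile-style text (one hostname per line, repeated once per slot)
 * into a table of unique hosts with their slot counts.
 * Returns the number of unique hosts, or -1 if the table is too small.
 * Blank lines and over-long lines are skipped.
 */
int count_host_slots(const char *text, struct host_slots *out, int max_hosts)
{
    assert(text != NULL && out != NULL);

    int nhosts = 0;
    const char *p = text;

    while (*p != '\0') {
        size_t len = strcspn(p, "\r\n");        /* length of the current line */

        if (len > 0 && len < MAX_NAME) {
            int i;

            /* Linear search: fine for the small node counts assumed here. */
            for (i = 0; i < nhosts; i++) {
                if (strlen(out[i].name) == len &&
                    strncmp(out[i].name, p, len) == 0)
                    break;
            }
            if (i == nhosts) {                  /* first time we see this host */
                if (nhosts == max_hosts)
                    return -1;
                memcpy(out[nhosts].name, p, len);
                out[nhosts].name[len] = '\0';
                out[nhosts].slots = 0;
                nhosts++;
            }
            out[i].slots++;
        }
        p += len;
        p += strspn(p, "\r\n");                 /* skip line terminators */
    }
    return nhosts;
}
```

The caller would read the nodefile into a buffer and pass it in; each resulting
host name could then be used as the target of a spawn. This only covers nodes
that are already part of the allocation, which is exactly the limitation above.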

>
> But if you ask Hydra to spawn a process on a particular node (using the info argument), it will do that.

This is the same approach I already use: I select the target node through the info argument of MPI_Comm_spawn().
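
To be concrete, a minimal sketch of that kind of call. It assumes the reserved
"host" info key of MPI_Comm_spawn(); the helper name spawn_on_host() and its
parameters are hypothetical placeholders, not the actual code of our library:

```c
#include <mpi.h>

/*
 * Hypothetical helper: spawn `nprocs` copies of `command` on the node named
 * `host`, returning the intercommunicator to the children in `intercomm`.
 * Assumes MPI is already initialized. The const-qualified string parameters
 * match MPI-3 bindings; older MPI-2 headers may need casts to (char *).
 */
int spawn_on_host(const char *host, const char *command, int nprocs,
                  MPI_Comm *intercomm)
{
    MPI_Info info;
    int rc;

    MPI_Info_create(&info);
    MPI_Info_set(info, "host", host);   /* ask for placement on this node */

    /* MPI_COMM_SELF: only the calling process acts as the spawning parent. */
    rc = MPI_Comm_spawn(command, MPI_ARGV_NULL, nprocs, info,
                        0, MPI_COMM_SELF, intercomm, MPI_ERRCODES_IGNORE);

    MPI_Info_free(&info);

    /* With the default error handler a failure aborts before we get here,
     * so rc is only informative if the handler is MPI_ERRORS_RETURN. */
    return rc;
}
```

This only places processes on a node the process manager already knows about,
so by itself it does not solve the problem of nodes allocated later.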

>
>  -- Pavan
>
> On 03/02/2011 05:55 AM, Roberto Fichera wrote:
>> Hi All,
>>
>> I wrote a parallelization library on top of MPICH2 that basically
>> performs dynamic distribution of jobs among all the assigned nodes. The
>> library makes heavy use of MPI_Comm_spawn() with an MPICH2 build configured
>> for MPI_THREAD_MULTIPLE. This works pretty well, and we can easily use different
>> algorithms for the spawned jobs once the data is decomposed. Furthermore, the library
>> permits the user to assign more than one node to any given job. This ends up
>> "electing" the given slave node as the master of a subset of nodes (submastering),
>> which performs a finer-grained parallel distribution as per the user's algorithm. So a
>> typical distribution scenario looks like a tree whose nodes are submasters. We have
>> discovered that, for certain algorithms, we might need to decide dynamically the number
>> of nodes to dedicate to a given submastering parallel computation, so I started to
>> investigate whether it would be possible to interface the library we use with the
>> cluster resource manager.
>>
>> My idea is that, under certain conditions, before calling MPI_Comm_spawn() to spawn
>> a job on a well-defined hostname, I would like to reserve, via the resource manager, the
>> number of nodes requested by the given submastering job. My first thought was to write
>> an interface to PBS (our cluster manager/scheduler) using its libpbs, basically
>> implementing something like a qsub/mpiexec function call. But I came to the conclusion
>> that, for implementing the dynamic bootstrap process on the newly allocated
>> nodes, Hydra would be much more flexible, since it seems to provide a more
>> abstract interface for this kind of integration.
>>
>> So my question is: does anyone know a better way to do this? Could this approach
>> cause problems that I am not seeing?
>>
>> Any suggestions will be really appreciated.
>>
>> Thanks in advance.
>> Roberto Fichera.
>> _______________________________________________
>> mpich-discuss mailing list
>> mpich-discuss at mcs.anl.gov
>> https://lists.mcs.anl.gov/mailman/listinfo/mpich-discuss
>


