[mpich-discuss] Some questions on mpdman and sock channel

Rajeev Thakur thakur at mcs.anl.gov
Wed Mar 12 22:04:36 CDT 2008


> I'm a newbie to MPICH2 and have been trying to get familiar 
> with the code.  I've got some questions and am hoping some 
> kind soul will enlighten me.
> 
> I'm working off the stable version 1.0.6p1 on Linux.
> 
> MPDMAN:
> 
> 1.  I'm confused about why there is a sock pair between MPD and
> MPDMAN.   What is this used for?

mpdman manages a single job. That connection can be used to request info
about the current ring size, etc., and to request services for the job that
the manager cannot perform itself.

> 2.  There seems to be some relationship between MPDMAN and the ring 
> formed by the MPDs, but I am unclear what's going on.  Is the MPDMAN 
> somehow inserting itself into the MPD ring?  If so, is it taking the 
> lhs of its parent MPD and connecting it as its rhs; what happens to 
> its lhs?  It seems as though MPDMAN can send messages on the MPD ring?

The mpdman does not enter the mpd ring. The managers for a single job form
their own ring, connected back to the mpiexec process to pass back stdout,
etc.
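
To make the "own ring" idea concrete, here is a minimal Python sketch of a
generic ring in which every member has a left-hand (lhs) and right-hand (rhs)
neighbor. It is not taken from the mpdman source: the Manager class,
build_ring and circulate are hypothetical names, plain object references
stand in for sockets, and the connection back to mpiexec is not modeled.

```python
# Illustrative only: a generic ring of per-job managers.
# Manager, build_ring and circulate are hypothetical names, not mpdman APIs.

class Manager:
    def __init__(self, rank):
        self.rank = rank
        self.lhs = None  # left-hand neighbor in the ring
        self.rhs = None  # right-hand neighbor in the ring


def build_ring(n):
    """Create n managers and link each one to its lhs and rhs neighbors."""
    managers = [Manager(r) for r in range(n)]
    for r, m in enumerate(managers):
        m.rhs = managers[(r + 1) % n]
        m.lhs = managers[(r - 1) % n]
    return managers


def circulate(start):
    """Return the ranks visited when a message travels rhs-wards once around."""
    visited = [start.rank]
    node = start.rhs
    while node is not start:
        visited.append(node.rank)
        node = node.rhs
    return visited


managers = build_ring(4)
path = circulate(managers[2])
```

In this sketch a message that starts at any manager reaches every other
manager of the same job exactly once before returning to its origin, and none
of the links touch the separate ring formed by the mpds.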

>
> 3.  What's the intended meaning of entry_ifhn/entry_port?  What's 
> connected to the two ends?  I see that MPDMAN's lhsIfhn and lhsPort 
> are set to these, but why?

Entry items typically are related to where a process enters the current
ring.

>
> 4.  Why is there a self.ring instance inside of MPDMAN?

Because mpdman belongs to a ring for the job.

> Sock channel:
> 
> I'm fundamentally confused about how MPI communication calls 
> go through the channel abstraction.  My understanding is that 
> an MPI program on one host can talk directly to another MPI 
> program on another host.  But if the MPI program only has a 
> notion of "rank", how does this get translated into IP 
> address/port in the sock channel?
> There must be some translation table somewhere, but I can't find it.

Each process publishes its "business card" using PMI_Put. The business card
contains its IP address and port. Another process can obtain the business
card by calling PMI_Get. (PMI stands for process manager interface.) The
process manager maintains a key-value cache that is accessed with PMI_Put/Get.
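
The rank-to-address translation described above can be sketched as a toy
key-value cache in Python. This is a stand-in under stated assumptions, not
the MPICH2 implementation and not the PMI API: the KeyValCache class, the
"card-<rank>" key naming, and the "host=...;port=..." card format are all
invented for illustration, and the addresses are made-up examples.

```python
# Toy model of publishing and looking up business cards.
# Everything here is hypothetical; it only mimics the put/get pattern.

class KeyValCache:
    """Stands in for the key-value cache kept by the process manager."""

    def __init__(self):
        self._store = {}

    def put(self, key, value):
        self._store[key] = value

    def get(self, key):
        return self._store[key]


def make_business_card(host, port):
    # Assumed card format; the real one is defined by the channel.
    return f"host={host};port={port}"


def parse_business_card(card):
    fields = dict(item.split("=", 1) for item in card.split(";"))
    return fields["host"], int(fields["port"])


def publish(kvs, rank, host, port):
    """Each process stores its own card under a key derived from its rank."""
    kvs.put(f"card-{rank}", make_business_card(host, port))


def lookup(kvs, rank):
    """A peer translates a rank into (host, port) by fetching that card."""
    return parse_business_card(kvs.get(f"card-{rank}"))


kvs = KeyValCache()
publish(kvs, 0, "10.0.0.1", 5000)
publish(kvs, 1, "10.0.0.2", 5001)
peer_host, peer_port = lookup(kvs, 1)
```

The point of the pattern is that the MPI program itself only ever names a
rank; the address and port are resolved through the cache when a connection
to that rank is needed.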


Rajeev



