[MPICH2-dev] mpd client library and protocol?
Rusty Lusk
lusk at mcs.anl.gov
Thu May 18 13:12:01 CDT 2006
Hi Greg,
Now that I see what you need, I have a different answer. At least
most of what you need has been implemented and documented as part of the
mpd package, which includes other commands besides mpiexec, such as
mpdlistjobs, mpdsigjob, and mpdkilljob. Other parts may be available as
effects of doing something to the mpiexec process, like suspending it,
continuing it, or killing it. (These were originally implemented for
the purpose of interactive control, but should do what you want if you
deliver the appropriate signals to mpiexec.) For at least one level of
documentation, once you have installed mpd, you can do
    mpdhelp
to get a list of the mpd commands, and then
    <mpdcmd> --help
to get a description of how to use each command. More information is in
the MPICH2 Installer's Guide and User's Guide, in the doc subdirectory
of mpich2.
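As a rough illustration of the signal route mentioned above, here is a
minimal Python sketch. It is not taken from the mpd sources. The signal
choices (SIGTSTP, SIGCONT, SIGINT) are an assumption about what interactive
suspend, continue, and kill correspond to, and ACTION_SIGNALS and control
are hypothetical names; check how your mpiexec actually responds before
relying on them.

```python
import signal

# ASSUMPTION: these mirror what a terminal sends for Ctrl-Z, `fg`, and
# Ctrl-C. The signals mpiexec really expects are not stated in this note.
ACTION_SIGNALS = {
    "suspend": signal.SIGTSTP,
    "continue": signal.SIGCONT,
    "kill": signal.SIGINT,
}


def control(proc, action):
    """Send the signal mapped to `action` to `proc`.

    `proc` is any object with a send_signal() method, such as the
    subprocess.Popen object obtained when the tool launched mpiexec.
    """
    if action not in ACTION_SIGNALS:
        raise ValueError("unknown action: %r" % (action,))
    proc.send_signal(ACTION_SIGNALS[action])
```

A front end that started mpiexec through subprocess.Popen could then call
control(job, "suspend") and later control(job, "continue").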
What these commands do (including mpiexec) is contact the locally
running mpd and talk to it via messages consisting of Python
dictionaries. Yes, you could write your own program to generate these
messages, but I would hope that we have already implemented much of what
you need, and we would be interested in implementing the rest of what
you need in collaboration with you.
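To give a feel for why a non-Python client is awkward, here is a small
sketch of sending a dictionary as a message. It is not the actual mpd wire
format, which this note does not spell out: the use of pickle, the
length-prefix framing, and the keys in the usage line are all assumptions
made for illustration, and encode_msg and decode_msg are hypothetical names.

```python
import pickle
import struct

# ASSUMPTION: a 4-byte big-endian length header followed by a pickled dict.
# This framing is invented for the sketch; it is not mpd's real protocol.
HEADER = struct.Struct("!I")


def encode_msg(msg):
    """Serialize a dictionary into a length-prefixed byte string."""
    payload = pickle.dumps(msg, protocol=2)
    return HEADER.pack(len(payload)) + payload


def decode_msg(data):
    """Recover the dictionary from bytes produced by encode_msg().

    pickle.loads must only be used on data from a trusted peer.
    """
    (length,) = HEADER.unpack(data[:HEADER.size])
    return pickle.loads(data[HEADER.size:HEADER.size + length])
```

For example, decode_msg(encode_msg({"cmd": "example_request"})) gives back
the same dictionary. A C or Java client would have to reproduce the pickle
encoding byte for byte, which is the difficulty with a non-Python client.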
Regards,
Rusty
P.S. I am about to invite you to a workshop at Oak Ridge on July 12-14.
Mark your calendar. :-)
From: Greg Watson <gwatson at lanl.gov>
Subject: Re: [MPICH2-dev] mpd client library and protocol?
Date: Thu, 18 May 2006 11:15:30 -0600
> Rajeev,
>
> Sorry for not being clearer. Yes, I need to be able to control an
> MPI program, not implement MPI or a process manager. I'm exploring
> the possibility of using Eclipse (via the Parallel Tools Platform) to
> manage the launch and control of MPI programs using MPICH2. The
> architecture requires an interface between PTP (Java) and the runtime
> system (in this case the MPICH2 process manager) that supports a few
> basic commands (including RUN, TERMINATE, GETJOBS, GETPROCESSES,
> etc.) and responds to certain events, such as process termination.
>
> Unless you can suggest a better approach, I'm thinking of writing a
> python program that will provide this interface to mpd. I'd prefer to
> do it in Java or C as I'll have to re-implement a bunch of stuff in
> python, but because of the way you serialize python objects it
> doesn't look possible to use a non-python program to communicate
> with mpd.
>
> If you have any documentation that would assist, it would be
> appreciated.
>
> Regards,
>
> Greg
>
> On May 17, 2006, at 7:46 PM, Rajeev Thakur wrote:
>
> > Greg,
> > What exactly are you trying to do? That might help us figure out
> > what your best option might be. I am attaching the document
> > describing the PMI interface. We use PMI to *implement* MPI. You
> > seem to want to control an MPI program. Is that right?
> >
> > Rajeev
> >
> >
> >
> > On Wed, 17 May 2006, Greg Watson wrote:
> >
> >> Bill,
> >>
> >> I'm not sure, since I still don't really understand the architecture.
> >> Can I use PMI to launch and control an MPI program on a cluster? Or
> >> is that something that will be available in the future? I would
> >> rather not have to provide a different program for each process
> >> manager, but cluster support is also essential.
> >>
> >> Any information or documentation you can provide on the architecture
> >> and APIs would be appreciated.
> >>
> >> Greg
> >>
> >>
> >> On May 16, 2006, at 10:21 PM, William Gropp wrote:
> >>
> >>> At 11:15 PM 5/16/2006, Greg Watson wrote:
> >>>> Rajeev,
> >>>>
> >>>> Many thanks for your reply. Can you suggest the best approach if I
> >>>> want to write a C program to control mpd? At a minimum, I'd like to
> >>>> be able to spawn/terminate an MPI job using a C program. Is PMI what
> >>>> I'd use to do this?
> >>>>
> >>>> Any documentation you could provide would be appreciated.
> >>>
> >>> An alternative is to not use MPD at all and to use the PMI
> >>> interface. A C example of this is the "gforker" process manager;
> >>> this is built using a set of utility routines in mpich2/src/pm/util
> >>> that provide the "other" side of the simple PMI interface. gforker
> >>> implements all of the PM functions, including spawning MPI jobs.
> >>> Let me know if this is the direction in which you are interested.
> >>>
> >>> Bill
> >>>
> >>>
> >>>> Thanks,
> >>>>
> >>>> Greg
> >>>>
> >>>> On May 16, 2006, at 7:56 PM, Rajeev Thakur wrote:
> >>>>
> >>>>> Greg,
> >>>>>
> >>>>>> I assume that mpdlib.py is a client library that other applications
> >>>>>> (i.e. other than mpiexec) could potentially use to communicate with
> >>>>>> and/or control mpd.
> >>>>>>
> >>>>>> 1. Is there any API documentation?
> >>>>>
> >>>>> The API is the Process Manager Interface (PMI), which is the interface
> >>>>> MPICH2 uses for interacting with process managers. There is some
> >>>>> documentation for it, which I could send you if you like (it may not
> >>>>> be 100% up to date).
> >>>>>
> >>>>>> 2. Is there a C version of the client library?
> >>>>>
> >>>>> The PMI library is in C. It is implemented in
> >>>>> src/pmi/simple/simple_pmi.c.
> >>>>>
> >>>>>> 3. Is the mpd protocol documented anywhere?
> >>>>>
> >>>>> Not currently, but the plan is to :-).
> >>>>>
> >>>>>> 4. Is the protocol used by mpd the same as that used by smpd?
> >>>>>
> >>>>> No, they are different.
> >>>>>
> >>>>> Rajeev
> >>>
> >>> William Gropp
> >>> http://www.mcs.anl.gov/~gropp
> >>
> >> <paper.pdf>
>