[MPICH2-dev] Eclipse PTP support for MPICH2

Greg Watson gwatson at lanl.gov
Thu Jun 8 11:46:25 CDT 2006


Rusty,

Here's what I'd like to be able to do:

mpiexec -np n+1 my_debugger target_program

Then my_debugger, which is an MPI program, does something like:

#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

int main(int argc, char *argv[])
{
	int rank, size, n;
	char val[16];

	MPI_Init(&argc, &argv);
	MPI_Comm_rank(MPI_COMM_WORLD, &rank);
	MPI_Comm_size(MPI_COMM_WORLD, &size);
	n = size - 1;	/* one extra process for the master debugger */

	if (rank < n) {
		/* hand the target its MPI identity via the environment */
		snprintf(val, sizeof(val), "%d", rank);
		setenv("PMI_RANK", val, 1);
		snprintf(val, sizeof(val), "%d", n);
		setenv("PMI_SIZE", val, 1);
		setenv("PMI_PORT", "...whatever", 1);

		fork_and_exec_under_debugger_control(argv[1]);

		slave_debugger();
	} else {
		main_debugger();
	}

	MPI_Finalize();
	return 0;
}
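
For concreteness, here's roughly what I have in mind for
fork_and_exec_under_debugger_control(). This is just a ptrace-based
sketch for Linux/Unix, not a finished implementation:

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Sketch only: fork the target and gain control of it before its
 * main() runs. The PMI_* variables set by the caller are inherited
 * across the fork/exec. */
pid_t fork_and_exec_under_debugger_control(char *target)
{
	pid_t pid = fork();

	if (pid == 0) {
		/* child: ask to be traced, then exec the target; the
		 * exec stops the child with SIGTRAP */
		ptrace(PTRACE_TRACEME, 0, NULL, NULL);
		execlp(target, target, (char *)NULL);
		perror("execlp");
		exit(1);
	}

	/* parent (the slave debugger): wait for the exec-time stop */
	waitpid(pid, NULL, 0);
	return pid;
}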

Notice that the debugger starts one more process than the target
program being debugged. The idea is that the environment variables
set prior to the fork_and_exec...() call will allow the target
program to complete its own MPI_Init(). The debugger and target
program communicators should be distinct as well.
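
To be explicit about the assumption here: I'm guessing the MPI
library's init code does something like the following internally.
This is my guess at the mechanism, not MPICH2's actual code, and
the function name is mine:

#include <stdlib.h>

/* Guess at what the target's MPI_Init() does internally: recover
 * its identity from the environment the debugger set up. */
static void init_from_environment(int *rank, int *size)
{
	const char *s;

	*rank = (s = getenv("PMI_RANK")) != NULL ? atoi(s) : 0;
	*size = (s = getenv("PMI_SIZE")) != NULL ? atoi(s) : 1;
	/* PMI_PORT would then be used to connect back to the process
	 * manager to finish the setup */
}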

Is this the kind of thing that Cobalt could provide?

I don't think the '-gdb' flag is going to help, since it only
starts the target program under the control of separate instances of
gdb, but doesn't provide any way for another (MPI) program to
control those gdbs individually.
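
What I want instead is for each slave_debugger() to take commands
from the master over MPI and apply them to its local target, along
these lines (a sketch; the command protocol and the master_rank
parameter are just illustrative):

#include <string.h>
#include <mpi.h>

/* Sketch: each slave (ranks 0..n-1) services debug commands sent by
 * the master (rank n) over the debugger's own MPI_COMM_WORLD, which
 * is distinct from the target program's. */
void slave_debugger(int master_rank)
{
	char cmd[256];
	MPI_Status status;

	for (;;) {
		MPI_Recv(cmd, sizeof(cmd), MPI_CHAR, master_rank, 0,
			 MPI_COMM_WORLD, &status);
		if (strcmp(cmd, "quit") == 0)
			break;
		/* ...apply cmd to the local target (e.g. via ptrace)
		 * and send any results back to the master... */
	}
}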

Regards,

Greg


On Jun 7, 2006, at 4:12 PM, Rusty Lusk wrote:

> Hi Greg,
>
>      I am glad this is working for you so far.  We would like to help
> with it in any way we can.  With regard to your specific questions  
> below:
>
> a) Our component-based resource management system (Cobalt) does  
> separate
> allocation and launch, if I understand what you mean correctly.  In
> fact, so far you have only seen the "launch" part, in mpd.  We have a
> completely separate scheduler that you would use for allocation, and
> feed results from it to mpd (either directly, through mpiexec, or
> through an XML file) to launch jobs.  Your "proxy" essentially  
> becomes a
> peer component in Cobalt and can do all sorts of things.
>
> b) Yes, environment variables are used to enable MPI_Init to work,  
> along
> with information that comes through the PMI interface.  In fact, if  
> you
> run "printenv" with MPD you can see what the environment looks like:
>
> shakey% mpiexec -genvnone -n 1 printenv
> PMI_RANK=0
> PMI_SIZE=1
> PMI_PORT=140.221.9.72:44587
> PMI_SPAWNED=0
> PMI_DEBUG=0
> MPICH_INTERFACE_HOSTNAME=140.221.9.72
> PATH=/sandbox/lusk/jul6-install-mpd/bin:/usr/local/X11R5/bin:.:/mcs/bin:/usr/local/bin:/homes/lusk/bin:.
> PMI_TOTALVIEW=0
>
> Again if I understand you correctly, the mpd infrastructure either
> already has what you need or can be tweaked to provide it, and we  
> would
> like to help.
>
> Also, have you looked at mpd's approach to debugging, the -gdb  
> option on
> mpiexec?  (formerly known as mpigdb).  It has some features I like,  
> and
> is an example of slipping a debugger in between the mpd managers  
> and the
> application programs.
>
> I hope the above is helpful.
>
> Regards,
> Rusty
>
>
>
> From: Greg Watson <gwatson at lanl.gov>
> Subject: [MPICH2-dev] Eclipse PTP support for MPICH2
> Date: Wed, 7 Jun 2006 11:56:49 -0600
>
>> Ok, I've got a prototype working against mpd. I've written a proxy
>> that lets Eclipse/PTP query the mpd for available hosts and load
>> these into its model so you get a view of the cluster. You can then
>> launch an MPI job, see the job status reflected in the views, see
>> stdout from each process, and terminate a job. The only assumption is
>> that the mpd's have already been started before you launch Eclipse.
>>
>> I'd like to try and get the debugger working next. The debugger is an
>> MPI program itself and the current (non-attach) debug startup works
>> as follows:
>>
>> 1. Request a process allocation for the debugger (n+1 procs) and
>> obtain a job id from the runtime.
>>
>> 2. Request a process allocation for the target program (n procs) and
>> obtain a job id from the runtime.
>>
>> 3. Using the debugger job id, request the runtime launch the  
>> debugger.
>>
>> 4. Once the debugger has started (MPI_Init has completed), set the
>> following in the environment of each process:
>>
>> - the job id of the target allocation
>> - the task id for each process in the target allocation
>> - the total number of processes in the target allocation
>>
>> 5. The debugger then forks/execs the target executable, which is now
>> under its control.
>>
>> 6. The target eventually calls MPI_Init and completes the
>> initialization using the information from the environment variables.
>>
>> This process requires support from the runtime, and assumes that: (a)
>> the runtime supports the separation of allocation and launch, and (b)
>> that MPI initialization can be completed using values taken from the
>> environment.
>>
>> Would anyone be able to comment if (a) is currently feasible with mpd
>> (maybe requiring modification) and if (b) is supported by PMI and
>> what environment variables are necessary?
>>
>> Thanks,
>>
>> Greg
>>
>>
>> On May 18, 2006, at 12:12 PM, Rusty Lusk wrote:
>>
>>> Hi Greg,
>>>
>>>      Now that I see what you need, I have a different answer.  At
>>> least
>>> most of what you need has been implemented and documented as part
>>> of the
>>> mpd package, which includes other commands besides mpiexec, such as
>>> mpdlistjobs, mpdsigjob, and mpdkilljob.  Other parts may be
>>> available as
>>> effects of doing something to the mpiexec process, like  
>>> suspending it,
>>> continuing it, or killing it.  (These were originally implemented  
>>> for
>>> the purpose of interactive control, but should do what you want  
>>> if you
>>> deliver the appropriate signals to mpiexec.)  For at least one
>>> level of
>>> documentation, once you have installed mpd, you can do
>>>
>>>     mpdhelp
>>>
>>> to get a list of the mpd commands, and then
>>>
>>>     <mpdcmd> --help
>>>
>>> to get a description of how to use each command.  More information
>>> is in
>>> the MPICH2 Installer's Guide and User's Guide, in the doc  
>>> subdirectory
>>> of mpich2.
>>>
>>> What these commands do (including mpiexec) is contact the locally
>>> running mpd and talk to it via messages consisting of python
>>> dictionaries.  Yes, you could write your own program to generate  
>>> these
>>> messages, but I would hope that we have already implemented much of
>>> what
>>> you need, and we would be interested in implementing the rest of  
>>> what
>>> you need in collaboration with you.
>>>
>>> Regards,
>>> Rusty
>>>
>>> P.S.  I am about to invite you to a workshop at Oak Ridge on
>>> July 12-14.  Mark your calendar... :-)
>>>
>>>
>>>
>>>
>>>
>>> From: Greg Watson <gwatson at lanl.gov>
>>> Subject: Re: [MPICH2-dev] mpd client library and protocol?
>>> Date: Thu, 18 May 2006 11:15:30 -0600
>>>
>>>> Rajeev,
>>>>
>>>> Sorry for not being clearer.  Yes, I need to be able to control an
>>>> MPI program, not implement MPI or a process manager. I'm exploring
>>>> the possibility of using Eclipse (via the Parallel Tools  
>>>> Platform) to
>>>> manage the launch and control of MPI programs using MPICH2. The
>>>> architecture requires an interface between PTP (Java) and the  
>>>> runtime
>>>> system (in this case the MPICH2 process manager) that supports a  
>>>> few
>>>> basic commands (including RUN, TERMINATE, GETJOBS, GETPROCESSES,
>>>> etc.) and responds to certain events, such as process termination.
>>>>
>>>> Unless you can suggest a better approach, I'm thinking of writing a
>>>> python program that will provide this interface to mpd. I'd  
>>>> prefer to
>>>> do it in Java or C as I'll have to re-implement a bunch of stuff in
>>>> python, but because of the way you serialize python objects it
>>>> doesn't look possible to use a non-python program to communicate
>>>> with mpd.
>>>>
>>>> If you have any documentation that would assist, it would be
>>>> appreciated.
>>>>
>>>> Regards,
>>>>
>>>> Greg
>>>>
>>>> On May 17, 2006, at 7:46 PM, Rajeev Thakur wrote:
>>>>
>>>>> Greg,
>>>>>      What exactly are you trying to do? That might help us figure
>>>>> out what
>>>>> might be your best option. I am attaching the document describing
>>>>> the PMI
>>>>> interface. We use PMI to *implement* MPI. You seem to want to
>>>>> control an MPI program. Is that right?
>>>>>
>>>>> Rajeev
>>>>>
>>>>>
>>>>>
>>>>> On Wed, 17 May 2006, Greg Watson wrote:
>>>>>
>>>>>> Bill,
>>>>>>
>>>>>> I'm not sure, since I still don't really understand the
>>>>>> architecture.
>>>>>> Can I use PMI to launch and control an MPI program on a  
>>>>>> cluster? Or
>>>>>> is that something that will be available in the future? I would
>>>>>> rather not have to provide a different program for each process
>>>>>> manager, but cluster support is also essential.
>>>>>>
>>>>>> Any information or documentation you can provide on the
>>>>>> architecture
>>>>>> and APIs would be appreciated.
>>>>>>
>>>>>> Greg
>>>>>>
>>>>>>
>>>>>> On May 16, 2006, at 10:21 PM, William Gropp wrote:
>>>>>>
>>>>>>> At 11:15 PM 5/16/2006, Greg Watson wrote:
>>>>>>>> Rajeev,
>>>>>>>>
>>>>>>>> Many thanks for your reply. Can you suggest the best approach
>>>>>>>> if I
>>>>>>>> want to write a C program to control mpd? At a minimum, I'd
>>>>>>>> like to
>>>>>>>> be able to spawn/terminate an MPI job using a C program. Is PMI
>>>>>>>> what
>>>>>>>> I'd use to do this?
>>>>>>>>
>>>>>>>> Any documentation you could provide would be appreciated.
>>>>>>>
>>>>>>> An alternative is to not use MPD at all and to use the PMI
>>>>>>> interface.  A C example of this is the "gforker" process  
>>>>>>> manager;
>>>>>>> this is built using a set of utility routines in mpich2/src/pm/
>>>>>>> util
>>>>>>> that provide the "other" side of the simple PMI interface.
>>>>>>> gforker
>>>>>>> implements all of the PM functions, including spawning MPI jobs.
>>>>>>> Let me know if this is the direction in which you are  
>>>>>>> interested.
>>>>>>>
>>>>>>> Bill
>>>>>>>
>>>>>>>
>>>>>>>> Thanks,
>>>>>>>>
>>>>>>>> Greg
>>>>>>>>
>>>>>>>> On May 16, 2006, at 7:56 PM, Rajeev Thakur wrote:
>>>>>>>>
>>>>>>>>> Greg,
>>>>>>>>>
>>>>>>>>>> I assume that mpdlib.py is a client library that other
>>>>>>>>>> applications
>>>>>>>>>> (i.e. other than mpiexec) could potentially use to  
>>>>>>>>>> communicate
>>>>>>>>>> with
>>>>>>>>>> and/or control mpd.
>>>>>>>>>>
>>>>>>>>>> 1. Is there any API documentation?
>>>>>>>>>
>>>>>>>>> The API is the Process Manager Interface (PMI), which is the
>>>>>>>>> interface
>>>>>>>>> MPICH2 uses for interacting with process managers. There is  
>>>>>>>>> some
>>>>>>>>> documentation for it, which I could send you if you like  
>>>>>>>>> (it may
>>>>>>>>> not be 100%
>>>>>>>>> up to date).
>>>>>>>>>
>>>>>>>>>> 2. Is there a C version of the client library?
>>>>>>>>>
>>>>>>>>> The PMI library is in C. It is implemented in src/pmi/simple/
>>>>>>>>> simple_pmi.c.
>>>>>>>>>
>>>>>>>>>> 3. Is the mpd protocol documented anywhere?
>>>>>>>>>
>>>>>>>>> Not currently, but the plan is to do so :-).
>>>>>>>>>
>>>>>>>>>> 4. Is the protocol used by mpd the same as that used by smpd?
>>>>>>>>>
>>>>>>>>> No, they are different.
>>>>>>>>>
>>>>>>>>> Rajeev
>>>>>>>
>>>>>>> William Gropp
>>>>>>> http://www.mcs.anl.gov/~gropp
>>>>>>
>>>>>> <paper.pdf>
>>>>
>>



