[mpich-discuss] hydra and SIGCONT

Ashley Pittman ashley at pittman.co.uk
Fri Dec 4 04:23:21 CST 2009


On Mon, 2009-11-30 at 13:35 -0600, Pavan Balaji wrote:
> Hi Ashley,
> 
> On 11/30/2009 11:37 AM, Ashley Pittman wrote:
> > In looking at getting hydra working with padb I've discovered that
> > mpiexec.hydra exits abnormally if you send it SIGCONT.  The default
> > action for this signal is to ignore it however hydra exits claiming to
> > have been killed by SIGTERM.
> 
> Good catch. At one point, we were planning on suspend/restart code for
> the mpiexec process itself, but later gave up on it because other tools
> used within hydra (such as ssh) are bad at handling them anyway. This is
> a remnant of that code. I'll clean it up.

Thanks, I've just seen the commit fly past.

Looking at the code, I'm not sure SIGSTOP is trappable. IIRC it's one of
the signals the application doesn't get any choice about, so you could
probably remove that handler as well.
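
A minimal sketch of that point, assuming a POSIX system. It is written in
Python for brevity and is not taken from the hydra source; the helper name
can_install_handler is made up for illustration. It asks the kernel to
install a handler for each signal and reports whether the request was
accepted. SIGSTOP and SIGKILL should be refused (the underlying sigaction
call fails with EINVAL), while SIGCONT and SIGTERM can be caught.

```python
import signal


def can_install_handler(signum):
    """Return True if this process is allowed to install a handler for signum.

    Hypothetical helper for illustration only; must be called from the
    main thread, since Python only allows signal.signal() there.
    """
    try:
        # Try to install a do-nothing handler; the kernel rejects this
        # for signals that cannot be caught (SIGSTOP, SIGKILL).
        previous = signal.signal(signum, lambda signo, frame: None)
    except OSError:
        return False
    # Put the original disposition back so the check has no side effects.
    # signal.signal() returns None if the old handler was not set from Python.
    signal.signal(signum, previous if previous is not None else signal.SIG_DFL)
    return True


if __name__ == "__main__":
    for name in ("SIGCONT", "SIGTERM", "SIGSTOP", "SIGKILL"):
        verdict = "catchable" if can_install_handler(getattr(signal, name)) else "not catchable"
        print(f"{name}: {verdict}")
```

If that holds, registering a handler for SIGSTOP has no effect beyond the
failed call, so dropping it should not change behaviour.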

Ashley,

-- 

Ashley Pittman, Bath, UK.

Padb - A parallel job inspection tool for cluster computing
http://padb.pittman.org.uk


