[mpich-discuss] Hydra SIGTSTP (Ctrl+Z) handling

Pavan Balaji balaji at mcs.anl.gov
Tue Nov 30 19:18:59 CST 2010


On 11/30/2010 12:31 PM, Yauheni Zelenko wrote:
> However, the same problem exists for rsh.

Yes -- as I mentioned, we cannot fix this for all bootstrap servers 
unless we drastically change the design, and it's not clear what that 
change would be. See the explanation below.

> As far as I know (but I may be mistaken), Hydra communicates with
> proxies (which launch the actual application processes) via sockets. I
> think it is (at least theoretically) possible to catch the signal in
> Hydra and send the related information via the socket. After receiving
> it, the proxy will relay the signal to the application processes. In
> this case, signal handling will be independent of the actual launcher.

Correct. In fact, after 1.3.1 was released, I reworked STDOUT/STDERR 
to use the control socket without relying on the launcher. I'm currently 
working on doing the same with STDIN. But that'll still leave us with 
the problem of the launcher closing its sockets on a SIGTSTP signal -- 
we cannot prevent that part. At the very least, all of the proxy debug 
and error messages will get lost because of these closed sockets.
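
For illustration, the relay scheme discussed above (catch the signal in 
the launcher process, send it over the control socket, have the proxy 
deliver it to the application processes) could be sketched roughly as 
follows. This is not Hydra's code: it is in Python for brevity, and the 
4-byte message format and the names forward_signal, relay_signal and 
install_forwarder are all hypothetical. It also does nothing about the 
launcher (ssh, rsh) closing its own sockets on SIGTSTP, which is the 
part that remains unsolved.

```python
import os
import signal
import socket
import struct

# Assumed wire format: one 4-byte network-order integer per signal.
SIGNAL_MSG = struct.Struct("!i")


def forward_signal(ctrl_sock, signum):
    """Launcher side: send the signal number over the control socket."""
    ctrl_sock.sendall(SIGNAL_MSG.pack(signum))


def relay_signal(ctrl_sock, child_pids, kill=os.kill):
    """Proxy side: read one signal message and deliver it to each child."""
    data = b""
    while len(data) < SIGNAL_MSG.size:
        chunk = ctrl_sock.recv(SIGNAL_MSG.size - len(data))
        if not chunk:
            raise ConnectionError("control socket closed")
        data += chunk
    (signum,) = SIGNAL_MSG.unpack(data)
    for pid in child_pids:
        kill(pid, signum)
    return signum


def install_forwarder(ctrl_sock, signum=signal.SIGTSTP):
    """Launcher side: forward signum instead of taking the default action."""
    signal.signal(
        signum,
        lambda received, _frame: forward_signal(ctrl_sock, received),
    )
```

With something like this, the launcher process would call 
install_forwarder() once per proxy connection, and each proxy would call 
relay_signal() whenever its control socket becomes readable. The kill 
parameter exists only so the delivery step can be replaced in a dry run.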

Is there any chance of explaining to the application developers that 
catching SIGTSTP in an MPI application is a very bad idea and they 
shouldn't be doing that?

  -- Pavan

-- 
Pavan Balaji
http://www.mcs.anl.gov/~balaji

