[mpich-discuss] Hydra SIGTSTP (Ctrl+Z) handling
Pavan Balaji
balaji at mcs.anl.gov
Mon Nov 29 19:14:47 CST 2010
On 11/29/2010 06:15 PM, Yauheni Zelenko wrote:
> Hydra handles SIGTSTP (Ctrl-Z) inconsistently. It works when the
> processes are started on the same host as mpiexec, but not when
> mpiexec starts processes remotely.
Not all signals are supported by Hydra, and SIGTSTP is one of the
unsupported ones. You are seeing inconsistent behavior because the
signal handling depends on the launcher. When you launch locally, the
"fork" launcher is used; when you launch remotely, the "ssh" launcher
is used. ssh misbehaves when it receives a SIGTSTP, and that is outside
Hydra's control (Hydra cannot stop other processes from running their
own signal handlers).
If you are looking to checkpoint the running MPI application, you
should: (1) configure MPICH2 with checkpointing support, and (2) send
the SIGUSR1 signal to mpiexec (or give the -ckpoint-interval option to
mpiexec).
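As a rough sketch of step (2), the signal can be sent from a small
Python script. The helper name request_checkpoint is made up for this
example, and the script assumes you already know the PID of the running
mpiexec (for instance from ps) and that MPICH2 was configured with
checkpointing support as in step (1).

```python
import os
import signal
import sys


def request_checkpoint(mpiexec_pid: int) -> None:
    """Send SIGUSR1 to the process with the given PID.

    Assumption: mpiexec_pid belongs to an mpiexec built with
    checkpointing support; this function does not verify that.
    """
    os.kill(mpiexec_pid, signal.SIGUSR1)


if __name__ == "__main__":
    if len(sys.argv) != 2:
        sys.exit("usage: request_checkpoint.py <mpiexec-pid>")
    request_checkpoint(int(sys.argv[1]))
```

Be careful to pass the right PID: the default action for SIGUSR1 is to
terminate the receiving process, so sending it to a process that does
not handle it will kill that process.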
-- Pavan
--
Pavan Balaji
http://www.mcs.anl.gov/~balaji