[mpich-discuss] Suspend jobs that use MPICH2 with Hydra

Shan-ho Tsai shtsai at uga.edu
Fri May 4 12:54:00 CDT 2012


Hello all,
We have MPICH2 1.4.1p1 installed on a RHEL5 cluster
and sometimes need to suspend all jobs cluster-wide.

Is there a way to suspend MPICH2 jobs that use Hydra, in
such a way that the master process and all slave processes
(on multiple nodes) get properly suspended?

If there is a way to do this, what is the procedure? Is there
a signal that we could send to mpiexec?

I tried sending a SIGSTOP to mpiexec, but only mpiexec
got suspended; the actual a.out processes continued to run.
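
One possible explanation for this (an assumption, not something confirmed
for Hydra): SIGSTOP cannot be caught by the process that receives it, so a
launcher has no chance to pass it on to the processes it started, whereas
SIGTSTP and SIGCONT can be handled. The sketch below is plain Python on a
POSIX system, independent of MPICH2; the helper name can_install_handler is
made up for illustration. It only checks which of these signals a process is
allowed to install a handler for. It does not show whether mpiexec actually
forwards any of them.

```python
import signal


def can_install_handler(signum):
    """Return True if this process may install its own handler for signum."""
    try:
        # Try to install a do-nothing handler; the kernel refuses this
        # for signals that cannot be caught (SIGKILL and SIGSTOP).
        previous = signal.signal(signum, lambda signo, frame: None)
    except OSError:
        return False
    # Put back whatever was installed before, so the check has no side effects.
    # signal.signal() returns None if the old handler was not set from Python.
    signal.signal(signum, previous if previous is not None else signal.SIG_DFL)
    return True


# Signals relevant to suspending and resuming a job.
results = {
    name: can_install_handler(getattr(signal, name))
    for name in ("SIGSTOP", "SIGTSTP", "SIGCONT")
}

for name, catchable in results.items():
    print(f"{name}: {'can' if catchable else 'cannot'} be caught")
```

If a launcher is to propagate a suspend request to remote processes, it would
have to be through a signal it can catch, which is why SIGTSTP looks like the
more promising candidate to ask about.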

I would really appreciate any suggestions.
Thank you,
Shan-Ho

----------------------------------------------------
Shan-Ho Tsai
University of Georgia, Athens GA




