[mpich-discuss] Suspend jobs that use MPICH2 with Hydra

Ju JiaJia jujj603 at gmail.com
Fri May 4 20:37:10 CDT 2012


I think you can do this with a resource manager and scheduler, such as
TORQUE with Maui, which can suspend and resume jobs.
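
As a rough illustration of the behaviour described in the original
question (SIGSTOP sent to one process stops only that process), here is
a minimal Python sketch that signals a whole local process group
instead. The names signal_group and demo are hypothetical, and the
sleeping child is only a stand-in for a launcher. It does not involve
Hydra and cannot reach processes on other nodes, which is the part a
resource manager would have to handle.

```python
import os
import signal
import subprocess
import sys


def signal_group(pgid: int, suspend: bool) -> None:
    """Stop or continue every process in one local process group."""
    os.killpg(pgid, signal.SIGSTOP if suspend else signal.SIGCONT)


def demo() -> bool:
    """Suspend and resume a stand-in child; return True if it was stopped."""
    # Assumption: the child is started in its own session, so its
    # process-group ID equals its PID and the signal cannot hit this script.
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(60)"],
        start_new_session=True,
    )
    try:
        signal_group(child.pid, suspend=True)
        # WUNTRACED makes waitpid report a stopped (not exited) child.
        _, status = os.waitpid(child.pid, os.WUNTRACED)
        stopped = os.WIFSTOPPED(status)
        signal_group(child.pid, suspend=False)
        return stopped
    finally:
        # Clean up the stand-in process regardless of the outcome.
        child.kill()
        child.wait()


if __name__ == "__main__":
    print("child was stopped:", demo())
```

Signalling the group rather than a single PID is what lets the stop
reach any children the launcher started on the same machine; it is
POSIX-only and says nothing about ranks running on remote hosts.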

On Sat, May 5, 2012 at 8:46 AM, Pavan Balaji <balaji at mcs.anl.gov> wrote:

> Hello,
>
> We don't support this right now.  I've created a ticket for it.
>
> https://trac.mcs.anl.gov/projects/mpich2/ticket/1627
>
> Please add yourself to the cc list of this ticket, if you'd like to be
> informed about updates on this issue.
>
>  -- Pavan
>
>
> On 05/04/2012 12:54 PM, Shan-ho Tsai wrote:
>
>> Hello all,
>> We have mpich2 1.4.1p1 installed on a RHEL5 cluster
>> and sometimes have the need to suspend all jobs clusterwide.
>>
>> Is there a way to suspend MPICH2 jobs that use Hydra, in
>> such a way that the master process and all slave processes
>> (on multiple nodes) get properly suspended?
>>
>> If there is a way to do this, what is the procedure? Is there
>> a signal that we could send to mpiexec?
>>
>> I tried sending a SIGSTOP to mpiexec, but only mpiexec
>> got suspended; the actual a.out processes continued to run.
>>
>> I really appreciate any suggestions.
>> Thank you,
>> Shan-Ho
>>
>> ----------------------------------------------------
>> Shan-Ho Tsai
>> University of Georgia, Athens GA
>>
>>
>>
>> _______________________________________________
>> mpich-discuss mailing list     mpich-discuss at mcs.anl.gov
>> To manage subscription options or unsubscribe:
>> https://lists.mcs.anl.gov/mailman/listinfo/mpich-discuss
>>
>
> --
> Pavan Balaji
> http://www.mcs.anl.gov/~balaji