[mpich-discuss] Hydra handling of non-zero exit codes (1.3.2, 1.4rc2)
Yauheni Zelenko
zelenko at cadence.com
Thu Apr 28 17:16:57 CDT 2011
Hi, Pavan!
Thank you for your help!
Eugene.
________________________________________
From: Pavan Balaji [balaji at mcs.anl.gov]
Sent: Thursday, April 28, 2011 3:11 PM
To: mpich-discuss at mcs.anl.gov
Cc: Yauheni Zelenko
Subject: Re: [mpich-discuss] Hydra handling of non-zero exit codes (1.3.2, 1.4rc2)
If a process terminates with a non-zero return code, Hydra cleans up the
remaining processes. Skipping this cleanup is risky, because the surviving
processes might hang. You can disable automatic cleanup by passing the
-disable-auto-cleanup option to mpiexec. I think this is what you are
looking for.
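
For a launching script like the one described in the original question, a minimal sketch of the invocation might look as follows. It assumes mpiexec is on the PATH and accepts the usual -n process-count option. The helper names build_mpiexec_command and run_and_get_code are hypothetical, chosen only for illustration.

```python
import subprocess


def build_mpiexec_command(app_argv, nprocs, auto_cleanup=True):
    """Build an mpiexec argument list.

    -disable-auto-cleanup is the option named in the reply above;
    everything else here is an illustrative assumption.
    """
    cmd = ["mpiexec"]
    if not auto_cleanup:
        cmd.append("-disable-auto-cleanup")
    cmd += ["-n", str(nprocs)]
    cmd += list(app_argv)
    return cmd


def run_and_get_code(cmd):
    """Run the command and return its exit code so the caller can branch on it."""
    return subprocess.run(cmd).returncode
```

A script could then call run_and_get_code(build_mpiexec_command(["./my_app"], 4, auto_cleanup=False)) and choose its post-processing step from the returned value. Here ./my_app and the process count 4 are placeholders.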
The return code of mpiexec is a bit-wise OR of all the process exit
codes, so if all processes return the same exit code, mpiexec will
return the same exit code as well.
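
To illustrate the bit-wise OR rule stated above, here is a small sketch that models only the description in this message, not Hydra's actual source. The function name combined_exit_code is hypothetical.

```python
from functools import reduce
from operator import or_


def combined_exit_code(codes):
    """Bit-wise OR of per-process exit codes, as described in the reply above."""
    return reduce(or_, codes, 0)


print(combined_exit_code([3, 3, 3, 3]))  # all processes return 3: prints 3
print(combined_exit_code([1, 2, 0, 0]))  # mixed codes: 1 | 2, prints 3
```

The second call shows why a launching script can rely on the combined value only when every process returns the same code: mixed codes can OR together into a value that no single process returned.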
-- Pavan
On 04/28/2011 05:02 PM, Yauheni Zelenko wrote:
> Hi!
>
> Our application can return non-zero exit codes as a flag to the launching script, which then runs further post-processing.
>
> Hydra prints "BAD TERMINATION OF ONE OF YOUR PROCESSES".
>
> I think it would be a good idea to add a command-line option to Hydra that allows non-zero exit codes and leaves them unchanged when all MPI processes return the same code.
>
> The problem can be reproduced with any MPICH2 example by returning non-zero from main().
>
> I also think it would be a good idea to print exit codes in Hydra's verbose output.
>
> Eugene.
--
Pavan Balaji
http://www.mcs.anl.gov/~balaji