[Swift-devel] [swift-bugs] [Bug 1014] mpi jobs fail on stampede

Michael Wilde wilde at mcs.anl.gov
Fri Jun 7 21:48:52 CDT 2013


This really looks like the mpich2 fd0 (stdin) problem.
Please try the "mpiexec < /dev/null" fix first. This is well tested.
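
For reference, a minimal sketch of that fix as it might look in a wrapper script such as mpitest.sh. The helper name run_without_stdin, the process count, and the application name are placeholders for illustration, not taken from the actual job:

```shell
#!/bin/bash
# Sketch: run a command with its stdin (fd 0) redirected from /dev/null,
# so the launcher never inherits whatever fd 0 the batch environment provides.
run_without_stdin() {
    "$@" < /dev/null
}

# Hypothetical usage (process count and binary name are placeholders):
#   run_without_stdin mpiexec -n 4 ./my_mpi_app
```

Writing "mpiexec ... < /dev/null" directly on the command line has the same effect; the helper only keeps the redirect in one place.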

On 6/7/13, bugzilla-daemon at mcs.anl.gov <bugzilla-daemon at mcs.anl.gov> wrote:
> https://bugzilla.mcs.anl.gov/swift/show_bug.cgi?id=1014
>
> --- Comment #7 from Mihael Hategan <hategan at mcs.anl.gov>  2013-06-07 21:04:08 ---
> A Google search suggests that the HYD messages are cleanup messages and
> happen after the user app exits.
>
> So I'd check if this is not a path issue (put
> /home1/01739/ketan/bin/mpitest.sh in the slurm script).
>
> I would also put some debugging lines in mpitest.sh (e.g. touch
> /shared/fs/path/mpitest_was_here) to see if it's started by swift.
>
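
The debugging suggestion in the quoted comment could be sketched roughly as follows, placed at the top of mpitest.sh. MARKER_DIR is a hypothetical variable that should point at a shared filesystem path; the /tmp fallback and the extra details written to the file are assumptions added for illustration:

```shell
#!/bin/bash
# Sketch: leave a marker file to show that this script was actually started.
# MARKER_DIR is hypothetical; set it to a directory on the shared filesystem.
MARKER_DIR="${MARKER_DIR:-${TMPDIR:-/tmp}}"
marker="$MARKER_DIR/mpitest_was_here.${HOSTNAME:-unknown}.$$"
{
    date
    echo "PATH=$PATH"
    # Record where mpiexec resolves from, to help rule out a path issue.
    echo "mpiexec=$(command -v mpiexec || echo 'not found on PATH')"
} > "$marker"
```

If no marker file appears after the job runs, the script was probably never started, which would point back at the path issue.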

-- 
Sent from my mobile device


