[petsc-users] Hang at PetscLayoutSetUp()

Derek Gaston friedmud at gmail.com
Mon Feb 6 23:34:05 CST 2012


On Mon, Feb 6, 2012 at 10:27 PM, Jed Brown <jedbrown at mcs.anl.gov> wrote:

>
> Are _all_ the processes making it here?
>

Sigh.  I knew someone was going to ask that ;-)

I'll have to write a short script to grab the stack trace from every one of
the 10,000 processes to see where they are and try to find any anomalies.
Anyone have a script (or pieces of one) to do this that they wouldn't mind
sharing?
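
A minimal sketch of what such a script might look like, assuming gdb is installed on the compute nodes and the PIDs of the local ranks are already known. The helper names (`backtrace`, `signature`, `group_by_signature`) are hypothetical, and getting the script onto each node is left to whatever remote-execution mechanism the cluster provides. The idea is to reduce each backtrace to its function names and group processes by that signature, so the odd ones out stand out among thousands of identical stacks:

```python
import re
import subprocess
from collections import defaultdict

# Matches gdb frame lines such as
#   "#0  0x00007f in poll () from /lib64/libc.so.6"
#   "#2  PetscLayoutSetUp (map=0x1) at pmap.c:100"
FRAME = re.compile(r"^#\d+\s+(?:0x[0-9a-fA-F]+\s+in\s+)?(\S+)\s*\(")


def backtrace(pid, timeout=60):
    """Return gdb's backtrace text for one process on the local node.

    Assumes gdb is on PATH and that attaching to the PID is permitted.
    """
    result = subprocess.run(
        ["gdb", "-batch", "-ex", "bt", "-p", str(pid)],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return result.stdout


def signature(bt_text):
    """Reduce a backtrace to a tuple of function names, innermost first."""
    names = []
    for line in bt_text.splitlines():
        match = FRAME.match(line.strip())
        if match:
            names.append(match.group(1))
    return tuple(names)


def group_by_signature(traces):
    """Map each distinct call-stack signature to the labels sharing it.

    `traces` maps a label (e.g. "host:pid") to raw backtrace text.
    """
    groups = defaultdict(list)
    for label, text in traces.items():
        groups[signature(text)].append(label)
    return dict(groups)
```

Sorting the resulting groups by size and printing the smallest ones first would surface any process that is not where the others are.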

I did spot-check quite a few, and they were all in the same spot.

Now here comes the weirdness: I left one of these processes attached in GDB
for quite a while (10+ minutes) after the whole job had been hung for over
an hour.  When I noticed that I had left it attached, I detached GDB and...
the job started right up!  That is, it moved on past this problem.  It
might have just been coincidence, or maybe stalling that process for a bit
by attaching GDB nudged some communication in the right direction.  I don't
know.

I know that's not terribly scientific.  I'll have to wait until the next
job hangs before I can do more inspection, but when (not if) that happens
I'll post back with more info.

Derek