[petsc-users] How to understand these error messages

Fande Kong fd.kong at siat.ac.cn
Wed Oct 23 18:22:20 CDT 2013


Jed,

Thank you very much.

The supercomputer center staff made some observations, and they may be
making some progress.  At least I can do some runs now. They also say that
it is something about ordering/rendezvous, and that there may be too many
messages, messages that are too long, or both.




On Wed, Oct 23, 2013 at 4:22 PM, Jed Brown <jedbrown at mcs.anl.gov> wrote:

> Fande Kong <fd.kong at siat.ac.cn> writes:
>
> > Hi Barry,
> >
> > I contacted the supercomputer center, and they asked me for a test case
> > so that they can forward it to IBM. Is it possible that we write a test
> > case only using MPI?  It is not a good idea that we send them the whole
> > petsc code and my code.
>
> This may be possible, but this smells of a message ordering/rendezvous
> problem, in which case you'll have to reproduce essentially the same
> communication pattern.  The fact that you don't see the error sooner in
> your program execution (and that it doesn't affect lots of other people)
> indicates that the bug may be very specific to your communication
> pattern.  In fact, it is likely that changing your element distribution
> algorithm, or some similar changes, may make the bug vanish.
> Reproducing all this matching context in a stand-alone code is likely to
> be quite a lot of effort.
>
> I would configure the system to dump core on errors and package up the
> test case.
>
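
On the suggestion to dump core on errors: below is a minimal sketch of
raising the core-file size limit from inside a process, assuming a
POSIX-like system where the limit is inherited by child processes. The
function name enable_core_dumps is hypothetical, not part of PETSc or MPI,
and the machine's batch system or compute-node kernel may override this
setting, so treat it as a starting point only. The same limit can be set in
C with getrlimit/setrlimit.

```python
import resource


def enable_core_dumps():
    """Raise the soft core-file size limit up to the hard limit.

    Returns the (soft, hard) pair that is in effect afterwards.
    An unprivileged process may raise its soft limit only as far as
    the hard limit, so the hard limit itself is left unchanged.
    """
    _, hard = resource.getrlimit(resource.RLIMIT_CORE)
    resource.setrlimit(resource.RLIMIT_CORE, (hard, hard))
    return resource.getrlimit(resource.RLIMIT_CORE)


if __name__ == "__main__":
    soft, hard = enable_core_dumps()
    # resource.RLIM_INFINITY means there is no size cap on core files.
    print("core limit (soft, hard):", soft, hard)
```

If the hard limit reported here is 0, core files stay disabled regardless,
and the limit would have to be raised by the system administrators or
through the job scheduler.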