[petsc-users] Automatically re-solving after MUMPS error

Matt Landreman matt.landreman at gmail.com
Wed Sep 30 16:10:38 CDT 2015


Hi Barry,
I tried adding PetscMallocDump() after SNESDestroy() as you suggested. When
MUMPS fails, PetscMallocDump() shows a number of mallocs which are absent
when MUMPS succeeds, the largest being in MatConvertToTriples_mpiaij_mpiaij()
(line 638 in petsc-3.6.0/src/mat/impls/aij/mpi/mumps/mumps.c).  The total
memory reported by PetscMallocDump() after SNESDestroy() is substantially
(>20x) larger when MUMPS fails than when MUMPS succeeds, and this amount
increases uniformly with each MUMPS failure.  So I think some of the
MUMPS-related structures are not being deallocated by SNESDestroy() if MUMPS
generates an error.
Thanks,
-Matt

On Wed, Sep 30, 2015 at 2:16 PM, Barry Smith <bsmith at mcs.anl.gov> wrote:

>
> > On Sep 30, 2015, at 1:06 PM, Matt Landreman <matt.landreman at gmail.com> wrote:
> >
> > PETSc developers,
> >
> > I tried implementing a system for automatically increasing MUMPS
> > ICNTL(14), along the lines described in this recent thread. If SNESSolve
> > returns ierr .ne. 0 due to MUMPS error -9, I call SNESDestroy,
> > re-initialize SNES, call MatMumpsSetIcntl with a larger value of ICNTL(14),
> > call SNESSolve again, and repeat as needed. The procedure works, but the
> > peak memory required (as measured by the HPC system) is 50%-100% higher if
> > the MUMPS solve has to be repeated compared to when MUMPS works on the 1st
> > try (by starting with a large ICNTL(14)), even though SNESDestroy is called
> > in between the attempts. Are there some PETSc or MUMPS structures which
> > would not be deallocated immediately by SNESDestroy?  If so, how do I
> > deallocate them?
>
>    They should all be destroyed automatically for you. You can use
> PetscMallocDump() after the SNES is destroyed to check whether all that
> memory is properly freed.
>
>    My guess is that your new malloc() with the bigger workspace cannot
> "reuse" the space that was previously freed; so to the OS it looks like you
> are using a lot more space but in terms of physical memory you are not
> using more.
>
>   Barry
>
> >
> > Thanks,
> > Matt Landreman
> >
> >
> > On Tue, Sep 15, 2015 at 7:47 AM, David Knezevic <david.knezevic at akselos.com> wrote:
> > On Tue, Sep 15, 2015 at 7:29 PM, Matthew Knepley <knepley at gmail.com> wrote:
> > On Tue, Sep 15, 2015 at 4:30 AM, David Knezevic <david.knezevic at akselos.com> wrote:
> > In some cases, I get MUMPS error -9, i.e.:
> > [2]PETSC ERROR: Error reported by MUMPS in numerical factorization phase: INFO(1)=-9, INFO(2)=98927
> >
> > This is easily fixed by re-running the executable with
> > -mat_mumps_icntl_14 on the commandline.
> >
> > However, I would like to update my code in order to do this
> > automatically, i.e. detect the -9 error and re-run with the appropriate
> > option. Is there a recommended way to do this? It seems to me that I could
> > do this with a PETSc error handler (e.g. PetscPushErrorHandler) in order to
> > call a function that sets the appropriate option and solves again, is that
> > right? Are there any examples that illustrate this type of thing?
> >
> > I would not use the error handler. I would just check the ierr return
> > code from the solver. I think you need the
> > INFO output, for which you can use MatMumpsGetInfo().
> >
> >
> > OK, that sounds good (and much simpler than what I had in mind), thanks
> > for the help!
> >
> > David
> >
> >
>
>
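
The retry procedure discussed in this thread (detect MUMPS INFO(1) = -9,
raise ICNTL(14), solve again) can be sketched roughly as below. This is only
an illustration of the control flow, not code from any poster: it does not
call PETSc or MUMPS at all. mock_factorize() and solve_with_retry() are
hypothetical names, the "required_percent" threshold is an invented stand-in
for whatever workspace the real factorization needs, and doubling ICNTL(14)
on each failure is an assumed growth policy. In a real code the marked line
would be replaced by MatMumpsSetIcntl(), the solve itself, and
MatMumpsGetInfo() as mentioned above.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for one factorization attempt. Returns 0 on
   success, or -9 (the INFO(1) value seen in this thread) when the
   workspace increase icntl14 (in percent) is below what the mock
   problem requires. */
static int mock_factorize(int icntl14, int required_percent)
{
    return (icntl14 >= required_percent) ? 0 : -9;
}

/* Retry loop: on error -9, double ICNTL(14) and try again.
   Returns the ICNTL(14) value that succeeded, or -1 if every attempt
   failed or a different error code was seen. The number of attempts
   made is written to *attempts_used. */
static int solve_with_retry(int initial_icntl14, int required_percent,
                            int max_attempts, int *attempts_used)
{
    assert(initial_icntl14 > 0 && max_attempts > 0);
    int icntl14 = initial_icntl14;

    for (int attempt = 1; attempt <= max_attempts; ++attempt) {
        /* Real code: set ICNTL(14), solve, then query INFO(1). */
        int info1 = mock_factorize(icntl14, required_percent);
        *attempts_used = attempt;

        if (info1 == 0) {
            return icntl14;          /* factorization succeeded */
        }
        if (info1 != -9) {
            return -1;               /* some other error: do not retry */
        }
        printf("attempt %d failed with INFO(1)=-9 at ICNTL(14)=%d\n",
               attempt, icntl14);
        icntl14 *= 2;                /* assumed policy: double the workspace */
    }
    return -1;                       /* gave up after max_attempts */
}
```

Calling solve_with_retry(20, 70, 5, &n) tries ICNTL(14) values of 20, 40 and
80 in turn. Capping the number of attempts keeps a genuinely too-large
problem from looping forever. Note that this sketch says nothing about the
memory that may be left allocated after each failed attempt, which is the
issue reported at the top of this message.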