[petsc-users] superlu_dist+parmetis

Xiaoye S. Li xsli at lbl.gov
Fri Feb 15 09:01:19 CST 2019


I am pretty sure this is a bug in ParMETIS.  A few years ago, I tracked down
this "divide by zero" bug and reported it to them. They said they would take
a look, but never did.  This usually happens when the graph is relatively
dense.

You can try sequential Metis.
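
A minimal sketch of that switch on the PETSc side, with some assumptions: the value name METIS_AT_PLUS_A is what PETSc's SuperLU_DIST interface is assumed to use for sequential METIS (check the output of -help for the installed version), and -mat_superlu_dist_parsymbfact is dropped on the assumption that parallel symbolic factorization depends on the ParMETIS ordering. Passing the option through the PETSC_OPTIONS environment variable is only one way to do it; the same option can be given on the command line.

```shell
# Failing combination from the original report, for reference:
#   -mat_superlu_dist_colperm parmetis -mat_superlu_dist_parsymbfact
# Assumed workaround: sequential METIS column ordering, with the
# parallel symbolic factorization flag left out.
PETSC_OPTIONS="-mat_superlu_dist_colperm METIS_AT_PLUS_A"
export PETSC_OPTIONS
echo "PETSC_OPTIONS=$PETSC_OPTIONS"
```

Running the application with -options_left afterwards should show whether the option was actually picked up.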

Sherry

On Thu, Feb 14, 2019 at 10:38 PM Smith, Barry F. via petsc-users <
petsc-users at mcs.anl.gov> wrote:

>
>
> > On Feb 15, 2019, at 12:30 AM, Marius Buerkle <mbuerkle at web.de> wrote:
> >
> > It works with all options for "-mat_superlu_dist_colperm" except parmetis.
> > Or do you mean changing some other option besides colperm?
>
>   No, I didn't have any other suggestions. It could be that this option
> just introduces zero pivots more easily. Alternatively, you could report
> it to Sherry Li, the main SuperLU_DIST developer, and see what she says.
>
>    Barry
>
> >
> >> Sent: Friday, 15 February 2019 at 04:16
> >> From: "Smith, Barry F." <bsmith at mcs.anl.gov>
> >> To: "Marius Buerkle" <mbuerkle at web.de>
> >> Cc: "PETSc users list" <petsc-users at mcs.anl.gov>
> >> Subject: Re: [petsc-users] superlu_dist+parmetis
> >>
> >>
> >>  Given the message, the likely cause is that SuperLU_DIST got a zero
> >> pivot on process 6. Presumably the parmetis colperm is the one that
> >> induced the zero pivot.
> >>
> >>  Have you tried other superlu_dist options to find ones that do not
> >> cause this?
> >>
> >>   Barry
> >>
> >>
> >>> On Feb 14, 2019, at 6:44 PM, Marius Buerkle via petsc-users <petsc-users at mcs.anl.gov> wrote:
> >>>
> >>> Dear PETSc team,
> >>>
> >>> I am trying to run superlu_dist+parmetis with "-mat_superlu_dist_colperm
> >>> parmetis -mat_superlu_dist_parsymbfact", which gives me the following error:
> >>>
> >>> [6]PETSC ERROR: Caught signal number 8 FPE: Floating Point Exception,probably divide by zero
> >>> [6]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
> >>> [6]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind
> >>> [6]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors
> >>> [651]PETSC ERROR: #1 User provided function() line 0 in  unknown file
> >>> [653]PETSC ERROR: ------------------------------------------------------------------------
> >>> [653]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end
> >>> [653]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
> >>> [653]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind
> >>> [653]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors
> >>> [653]PETSC ERROR: likely location of problem given in stack below
> >>> [653]PETSC ERROR: ---------------------  Stack Frames ------------------------------------
> >>> [653]PETSC ERROR: Note: The EXACT line numbers in the stack are not available,
> >>> [653]PETSC ERROR:       INSTEAD the line number of the start of the function
> >>> [653]PETSC ERROR:       is given.
> >>> [653]PETSC ERROR: [653] SuperLU_DIST:pzgssvx line 465 /home/cdfmat_marius/prog/petsc/git/petsc/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c
> >>> [653]PETSC ERROR: [653] MatLUFactorNumeric_SuperLU_DIST line 314 /home/cdfmat_marius/prog/petsc/git/petsc/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c
> >>> [653]PETSC ERROR: [653] MatLUFactorNumeric line 3124 /home/cdfmat_marius/prog/petsc/git/petsc/src/mat/interface/matrix.c
> >>> [653]PETSC ERROR: --------------------- Error Message --------------------------------------------------------------
> >>> [653]PETSC ERROR: Signal received
> >>> [653]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting.
> >>> [653]PETSC ERROR: Petsc Development GIT revision: v3.10.3-980-g66b342c  GIT Date: 2018-12-26 13:49:21 -0600
> >>> [653]PETSC ERROR: /home/cdfmat_marius/prog/transomat_latest_openmpi4.0/transomat on a  named h023 by cdfmat_marius Wed Feb 13 23:58:21 2019
> >>>
> >>> Any idea?
> >>>
> >>> best,
> >>> marius
> >>
> >>
>
>