[petsc-users] Strange behavior of MatLUFactorNumeric()

Jinquan Zhong jzhong at scsolutions.com
Tue Aug 14 18:56:35 CDT 2012


Barry,

I will install valgrind as one of my own packages on this machine and get back to you later.

Jinquan

-----Original Message-----
From: petsc-users-bounces at mcs.anl.gov [mailto:petsc-users-bounces at mcs.anl.gov] On Behalf Of Barry Smith
Sent: Tuesday, August 14, 2012 3:55 PM
To: PETSc users list
Subject: Re: [petsc-users] Strange behavior of MatLUFactorNumeric()


  Can you run with valgrind?

http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind



On Aug 14, 2012, at 5:39 PM, Jinquan Zhong <jzhong at scsolutions.com> wrote:

> Thanks, Matt.
>  
> 1.       Yes, I have checked the returned values from x obtained from
> MatSolve(F,b,x)
> 
>                 The norm error check for x completes for N=75 and N=2028.
> 
> 2.       Good point, Matt.  Here is the complete message for Rank 391.  The others are similar to this one.
>  
>  
> [391]PETSC ERROR: ------------------------------------------------------------------------
> [391]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range
> [391]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
> [391]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind
> [391]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors
> [391]PETSC ERROR: likely location of problem given in stack below
> [391]PETSC ERROR: ---------------------  Stack Frames ------------------------------------
> [391]PETSC ERROR: Note: The EXACT line numbers in the stack are not available,
> [391]PETSC ERROR:       INSTEAD the line number of the start of the function
> [391]PETSC ERROR:       is given.
> [391]PETSC ERROR: [391] MatLUFactorNumeric_SuperLU_DIST line 284 /nfs/06/com0488/programs/libraries/PETSc/petsc-3.3-p2/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c
> [391]PETSC ERROR: [391] MatLUFactorNumeric line 2778 /nfs/06/com0488/programs/libraries/PETSc/petsc-3.3-p2/src/mat/interface/matrix.c
> [391]PETSC ERROR: --------------------- Error Message ------------------------------------
> [391]PETSC ERROR: Signal received!
> [391]PETSC ERROR: ------------------------------------------------------------------------
> [391]PETSC ERROR: Petsc Release Version 3.3.0, Patch 2, Fri Jul 13 15:42:00 CDT 2012
> [391]PETSC ERROR: See docs/changes/index.html for recent updates.
> [391]PETSC ERROR: See docs/faq.html for hints about trouble shooting.
> [391]PETSC ERROR: See docs/index.html for manual pages.
> [391]PETSC ERROR: ------------------------------------------------------------------------
> [391]PETSC ERROR: /nfs/06/com0488/programs/examples/ZSOL0.2431/ZSOL on a arch-linu named n0272.ten.osc.edu by com0488 Sun Aug 12 23:18:07 2012
> [391]PETSC ERROR: Libraries linked from /nfs/06/com0488/programs/libraries/PETSc/petsc-3.3-p2/arch-linux2-cxx-debug/lib
> [391]PETSC ERROR: Configure run at Fri Aug  3 17:44:00 2012
> [391]PETSC ERROR: Configure options --with-blas-lib=/nfs/06/com0488/programs/libraries/ScaLAPACK/2.0.1/lib/librefblas.a --with-lapack-lib=/nfs/06/com0488/programs/libraries/ScaLAPACK/2.0.1/lib/libreflapack.a --download-blacs --download-scalapack --with-mpi-dir=/usr/local/mvapich2/1.7-gnu --with-mpiexec=/usr/local/bin/mpiexec --with-scalar-type=complex --with-precision=double --with-clanguage=cxx --with-fortran-kernels=generic --download-mumps --download-superlu_dist --download-parmetis --download-metis --with-fortran-interfaces
> [391]PETSC ERROR: ------------------------------------------------------------------------
> [391]PETSC ERROR: User provided function() line 0 in unknown directory unknown file
> [cli_391]: aborting job:
> application called MPI_Abort(MPI_COMM_WORLD, 59) - process 391
>  
>  
> From: petsc-users-bounces at mcs.anl.gov 
> [mailto:petsc-users-bounces at mcs.anl.gov] On Behalf Of Matthew Knepley
> Sent: Tuesday, August 14, 2012 3:34 PM
> To: PETSc users list
> Subject: Re: [petsc-users] Strange behavior of MatLUFactorNumeric()
>  
> On Tue, Aug 14, 2012 at 5:26 PM, Jinquan Zhong <jzhong at scsolutions.com> wrote:
> Dear PETSc folks,
>  
> I am seeing strange behavior when using MatLUFactorNumeric() on dense matrices of different orders N.  Here is the situation:
>  
> 1.       I use ./src/mat/tests/ex137.c as an example to direct PETSc in selecting SuperLU_DIST and MUMPS.  The calling sequence is
> 
> MatGetOrdering(A,...)
> 
> MatGetFactor(A,...)
> 
> MatLUFactorSymbolic(F, A,...)
> 
> MatLUFactorNumeric(F, A,...)
> 
> MatSolve(F,b,x)
> 
> 2.       I have three dense matrices A of three different orders: N=75, 2028 and 21180.
> 
> 3.       The calling sequence works for N=75 and 2028.  But when N=21180, the program hung when calling MatLUFactorNumeric(...).  It appeared to be a segmentation fault, with the following error message:
> 
>  
> 
> [1]PETSC ERROR: --------------------- Error Message 
> ------------------------------------
> [1]PETSC ERROR: Signal received!
>  
> ALWAYS send the entire error message. How can we tell anything from a small snippet?
>  
> Since you have [1], this was run in parallel, so you need 3rd party 
> packages. But you do not seem to be checking return values. Check them 
> to make sure those packages are installed correctly.
>  
>    Matt
>  
> Has anybody had a similar experience?
>  
> Thanks a lot!
>  
> Jinquan
> 
> 
>  
> --
> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.
> -- Norbert Wiener


