[petsc-users] Superlu_dist error

Matthew Knepley knepley at gmail.com
Sat Oct 5 22:55:24 CDT 2013


On Sat, Oct 5, 2013 at 10:49 PM, Jose David Bermeol <jbermeol at purdue.edu>wrote:

> Hi, I'm running PETSc, trying to solve a linear system with superlu_dist.
> However, I get a memory violation. Attached is the code, and here is the
> output. Email me if you need anything else to figure out what is
> happening.
>

So it looks like SuperLU_Dist is bombing during an LAPACK operation. It
could be an MKL problem, a SuperLU_Dist problem, our problem,
or a mismatch between versions. I would simplify the configuration
to narrow down the possibilities. First eliminate everything that is not
necessary for SuperLU_dist, then switch to --download-f-blas-lapack.
If you still get a crash, send us the matrix, since that should make the
failure reproducible and we can report a SuperLU_dist bug or fix our code.
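As a concrete sketch of the two suggestions above (the compiler wrappers and option subset here are illustrative, taken from your configure line, not a verified recipe):

```shell
# Hypothetical simplified reconfigure: keep only what SuperLU_DIST needs,
# and let PETSc build its own reference BLAS/LAPACK instead of using MKL,
# to rule out an MKL/LAPACK mismatch.
./configure --with-cc=mpiicc --with-cxx=mpiicpc --with-fc=mpiifort \
  --with-scalar-type=complex --with-debugging=1 \
  --download-f-blas-lapack \
  --download-metis=yes --download-parmetis=yes \
  --download-superlu_dist=yes
make all

# Then re-run the failing case against the simplified build:
mpiexec -n 2 ./test_solver -mat_superlu_dist_statprint \
  -mat_superlu_dist_matinput distributed
```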

  Thanks,

      Matt


> Thanks
>
> mpiexec -n 2 ./test_solver -mat_superlu_dist_statprint
> -mat_superlu_dist_matinput distributed
>         Nonzeros in L       10
>         Nonzeros in U       10
>         nonzeros in L+U     10
>         nonzeros in LSUB    10
>         NUMfact space (MB) sum(procs):  L\U     0.00    all     0.03
>         Total highmark (MB):  All       0.03    Avg     0.02    Max
> 0.02
>         Mat conversion(PETSc->SuperLU_DIST) time (max/min/avg):
>                               0.000146866 / 0.000145912 / 0.000146389
>         EQUIL time             0.00
>         ROWPERM time           0.00
>         COLPERM time           0.00
>         SYMBFACT time          0.00
>         DISTRIBUTE time        0.00
>         FACTOR time            0.00
>         Factor flops    1.000000e+02    Mflops      0.31
>         SOLVE time             0.00
> [0]PETSC ERROR:
> ------------------------------------------------------------------------
> [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation,
> probably memory access out of range
> [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
> [0]PETSC ERROR: or see
> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind
> [0]PETSC ERROR: [1]PETSC ERROR:
> ------------------------------------------------------------------------
> [1]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation,
> probably memory access out of range
> [1]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS
> X to find memory corruption errors
> Try option -start_in_debugger or -on_error_attach_debugger
> [1]PETSC ERROR: or see
> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind
> [1]PETSC ERROR: or try
> http://valgrind.org on GNU/linux and Apple Mac OS X to find memory
> corruption errors
> [0]PETSC ERROR: likely location of problem given in stack below
> [1]PETSC ERROR: likely location of problem given in stack below
> [1]PETSC ERROR: ---------------------  Stack Frames
> ------------------------------------
> [1]PETSC ERROR: Note: The EXACT line numbers in the stack are not
> available,
> [0]PETSC ERROR: ---------------------  Stack Frames
> ------------------------------------
> [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not
> available,
> [0]PETSC ERROR: [1]PETSC ERROR:       INSTEAD the line number of the start
> of the function
> [1]PETSC ERROR:       is given.
> [1]PETSC ERROR: [1] SuperLU_DIST:pzgssvx line 234
> /home/jbermeol/Nemo5/libs/petsc/build-cplx/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c
> [1]PETSC ERROR: [1] MatMatSolve_SuperLU_DIST line 198
> /home/jbermeol/Nemo5/libs/petsc/build-cplx/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c
> [1]PETSC ERROR:       INSTEAD the line number of the start of the function
> [0]PETSC ERROR:       is given.
> [0]PETSC ERROR: [0] SuperLU_DIST:pzgssvx line 234
> /home/jbermeol/Nemo5/libs/petsc/build-cplx/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c
> [0]PETSC ERROR: [1] MatMatSolve line 3207
> /home/jbermeol/Nemo5/libs/petsc/build-cplx/src/mat/interface/matrix.c
> [1]PETSC ERROR: --------------------- Error Message
> ------------------------------------
> [1]PETSC ERROR: [0] MatMatSolve_SuperLU_DIST line 198
> /home/jbermeol/Nemo5/libs/petsc/build-cplx/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c
> [0]PETSC ERROR: [0] MatMatSolve line 3207
> /home/jbermeol/Nemo5/libs/petsc/build-cplx/src/mat/interface/matrix.c
> Signal received!
> [1]PETSC ERROR:
> ------------------------------------------------------------------------
> [1]PETSC ERROR: Petsc Release Version 3.4.2, Jul, 02, 2013
> [1]PETSC ERROR: See docs/changes/index.html for recent updates.
> [1]PETSC ERROR: See docs/faq.html for hints about trouble shooting.
> [1]PETSC ERROR: See docs/index.html for manual pages.
> [1]PETSC ERROR:
> ------------------------------------------------------------------------
> [1]PETSC ERROR: ./test_solver on a linux-complex named
> carter-fe02.rcac.purdue.edu by jbermeol Sat Oct  5 23:45:21 2013
> [1]PETSC ERROR: [0]PETSC ERROR: --------------------- Error Message
> ------------------------------------
> [0]PETSC ERROR: Libraries linked from
> /home/jbermeol/Nemo5/libs/petsc/build-cplx/linux-complex/lib
> [1]PETSC ERROR: Configure run at Sat Oct  5 11:19:36 2013
> [1]PETSC ERROR: Configure options --with-cc=mpiicc --with-cxx=mpiicpc
> --with-fc=mpiifort --with-scalar-type=complex --with-shared-libraries=1
> --with-debugging=1 --with-pic=1 --with-clanguage=C++ --with-fortran=1
> --with-fortran-kernels=0
> --with-blas-lapack-dir=/apps/rhel6/intel/composer_xe_2013.3.163/mkl
> --with-blacs-lib=/apps/rhel6/intel/composer_xe_2013.3.163/mkl/lib/intel64/libmkl_blacs_intelmpi_lp64.so
> --with-blacs-include=/apps/rhel6/intel/composer_xe_2013.3.163/mkl/include
> --with-scalapack-lib="-L/apps/rhel6/intel/composer_xe_2013.3.163/mkl/lib/intel64
> -lmkl_scalapack_lp64 -lmkl_blacs_intelmpi_lp64"
> --with-scalapack-include=/apps/rhel6/intel/composer_xe_2013.3.163/mkl/include
> --with-valgrind-dir=/apps/rhel6/valgrind/3.8.1 --COPTFLAGS=-O3
> --CXXOPTFLAGS=-O3 --FOPTFLAGS=-O3
> --with-mkl-include=/apps/rhel6/intel/composer_xe_2013.3.163/mkl/include
> --with-mkl-lib="[/apps/rhel6/intel/composer_xe_2013.3.163/mkl/lib/intel64/libmkl_intel_lp64.so,/apps/rhel6/intel/composer_xe_2013.3.163/mkl/lib/intel64/libmkl_intel_thread.so,/apps/rhel6/intel/composer_xe_2013.3.163/mkl/lib/intel64/libmkl_core.so,/apps/rhel6/intel/composer_xe_2013.3.163/mkl/../compiler/lib/intel64/libiomp5.so]"
> --with-cpardiso-dir=/home/jbermeol/testPetscSolvers/intel_mkl_cpardiso
> --with-hdf5 --download-hdf5=yes --download-metis=yes
> --download-parmetis=yes --download-superlu_dist=yes --download-superlu=yes
> --download-mumps=yes --download-spooles=yes --download-pastix=yes
> --download-ptscotch=yes --download-umfpack=yes --download-sowing
> Signal received!
> [0]PETSC ERROR:
> ------------------------------------------------------------------------
> [0]PETSC ERROR: Petsc Release Version 3.4.2, Jul, 02, 2013
> [0]PETSC ERROR: See docs/changes/index.html for recent updates.
> [0]PETSC ERROR: [1]PETSC ERROR:
> ------------------------------------------------------------------------
> [1]PETSC ERROR: User provided function() line 0 in unknown directory
> unknown file
> See docs/faq.html for hints about trouble shooting.
> [0]PETSC ERROR: See docs/index.html for manual pages.
> [0]PETSC ERROR:
> ------------------------------------------------------------------------
> [0]PETSC ERROR: ./test_solver on a linux-complex named
> carter-fe02.rcac.purdue.edu by jbermeol Sat Oct  5 23:45:21 2013
> [0]PETSC ERROR: Libraries linked from
> /home/jbermeol/Nemo5/libs/petsc/build-cplx/linux-complex/lib
> [0]PETSC ERROR: Configure run at Sat Oct  5 11:19:36 2013
> [0]PETSC ERROR: application called MPI_Abort(MPI_COMM_WORLD, 59) - process
> 1
> Configure options --with-cc=mpiicc --with-cxx=mpiicpc --with-fc=mpiifort
> --with-scalar-type=complex --with-shared-libraries=1 --with-debugging=1
> --with-pic=1 --with-clanguage=C++ --with-fortran=1 --with-fortran-kernels=0
> --with-blas-lapack-dir=/apps/rhel6/intel/composer_xe_2013.3.163/mkl
> --with-blacs-lib=/apps/rhel6/intel/composer_xe_2013.3.163/mkl/lib/intel64/libmkl_blacs_intelmpi_lp64.so
> --with-blacs-include=/apps/rhel6/intel/composer_xe_2013.3.163/mkl/include
> --with-scalapack-lib="-L/apps/rhel6/intel/composer_xe_2013.3.163/mkl/lib/intel64
> -lmkl_scalapack_lp64 -lmkl_blacs_intelmpi_lp64"
> --with-scalapack-include=/apps/rhel6/intel/composer_xe_2013.3.163/mkl/include
> --with-valgrind-dir=/apps/rhel6/valgrind/3.8.1 --COPTFLAGS=-O3
> --CXXOPTFLAGS=-O3 --FOPTFLAGS=-O3
> --with-mkl-include=/apps/rhel6/intel/composer_xe_2013.3.163/mkl/include
> --with-mkl-lib="[/apps/rhel6/intel/composer_xe_2013.3.163/mkl/lib/intel64/libmkl_intel_lp64.so,/apps/rhel6/intel/composer_xe_2013.3.163/mkl/lib/intel64/libmkl_intel_thread.so,/apps/rhel6/intel/composer_xe_2013.3.163/mkl/lib/intel64/libmkl_core.so,/apps/rhel6/intel/composer_xe_2013.3.163/mkl/../compiler/lib/intel64/libiomp5.so]"
> --with-cpardiso-dir=/home/jbermeol/testPetscSolvers/intel_mkl_cpardiso
> --with-hdf5 --download-hdf5=yes --download-metis=yes
> --download-parmetis=yes --download-superlu_dist=yes --download-superlu=yes
> --download-mumps=yes --download-spooles=yes --download-pastix=yes
> --download-ptscotch=yes --download-umfpack=yes --download-sowing
> [0]PETSC ERROR:
> ------------------------------------------------------------------------
> [0]PETSC ERROR: User provided function() line 0 in unknown directory
> unknown file
> application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0




-- 
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener

