[petsc-users] Spectrum slicing with MUMPS (Segmentation fault)

Jose E. Roman jroman at dsic.upv.es
Tue Jul 19 09:42:23 CDT 2016


SuperLU_dist can be used in general with shift-and-invert, but it does not work for spectrum slicing (eps_interval) because it does not provide the matrix inertia (MatGetInertia), which is required in that case.

Jose
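
As a rough illustration of why the inertia is required: spectrum slicing needs to know how many eigenvalues lie in the requested interval, and it gets that count from the signs of the pivots of a symmetric factorization of the shifted matrix (Sylvester's law of inertia). The sketch below is pure Python for a small symmetric tridiagonal matrix. It is not SLEPc or MUMPS code, and the function names and the `pivmin` safeguard are assumptions made up for this example.

```python
def count_eigs_below(diag, off, sigma, pivmin=1e-300):
    """Count eigenvalues below sigma for a symmetric tridiagonal matrix.

    diag: main diagonal (length n); off: off-diagonal (length n-1).
    Runs the LDL^T pivot recurrence on T - sigma*I. By Sylvester's law
    of inertia, the number of negative pivots equals the number of
    eigenvalues smaller than sigma.
    """
    negatives = 0
    pivot = 1.0  # placeholder, not used for the first row
    for i, a in enumerate(diag):
        new_pivot = a - sigma
        if i > 0:
            new_pivot -= off[i - 1] ** 2 / pivot
        # Assumption: replace a (near-)zero pivot by a tiny negative
        # value so the recurrence can continue without dividing by zero.
        if abs(new_pivot) < pivmin:
            new_pivot = -pivmin
        if new_pivot < 0.0:
            negatives += 1
        pivot = new_pivot
    return negatives


def count_eigs_in_interval(diag, off, a, b):
    """Number of eigenvalues in (a, b), from two inertia computations."""
    return count_eigs_below(diag, off, b) - count_eigs_below(diag, off, a)


if __name__ == "__main__":
    # 1-D Laplacian of size 10: eigenvalues are 2 - 2*cos(k*pi/11), k = 1..10
    n = 10
    diag = [2.0] * n
    off = [-1.0] * (n - 1)
    print(count_eigs_in_interval(diag, off, 0.5, 2.1))  # prints 3
```

A factorization that only solves linear systems and does not report the pivot signs gives the eigensolver no way to obtain this count, which is why a solver without inertia cannot be used for eps_interval.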


> On 19 Jul 2016, at 16:38, Matthew Knepley <knepley at gmail.com> wrote:
> 
> On Tue, Jul 19, 2016 at 12:42 PM, Hassan Md Mahmudulla <mhassan at miners.utep.edu> wrote:
> Hi all,
> 
> I have been trying spectrum slicing with the MUMPS external solver. The error output is the following:
> 
> A stack trace in the debugger would help, but it sounds like an error in MUMPS. You can try SuperLU_dist instead.
> 
>   Thanks,
> 
>     Matt 
> 
> [0]PETSC ERROR: ------------------------------------------------------------------------
> [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range
> [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
> [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind
> [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors
> [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run
> [0]PETSC ERROR: to get more information on the crash.
> [0]PETSC ERROR: --------------------- Error Message --------------------------------------------------------------
> [0]PETSC ERROR: Signal received
> [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting.
> [0]PETSC ERROR: Petsc Release Version 3.5.3, Jan, 31, 2015
> [0]PETSC ERROR: /scratch1/scratchdirs/mhassan/dSLEPc/d540/../eigenSolverSS on a sandybridge named nid00281 by mhassan Tue Jul 19 02:54:00 2016
> [0]PETSC ERROR: Configure options --known-mpi-int64_t=0 --known-bits-per-byte=8 --known-sdot-returns-double=0 --known-snrm2-returns-double=0 --known-level1-dcache-assoc=0 --known-level1-dcache-linesize=32 --known-level1-dcache-size=32768 --known-memcmp-ok=1 --known-mpi-c-double-complex=1 --known-mpi-long-double=1 --known-mpi-shared-libraries=0 --known-sizeof-MPI_Comm=4 --known-sizeof-MPI_Fint=4 --known-sizeof-char=1 --known-sizeof-double=8 --known-sizeof-float=4 --known-sizeof-int=4 --known-sizeof-long-long=8 --known-sizeof-long=8 --known-sizeof-short=2 --known-sizeof-size_t=8 --known-sizeof-void-p=8 --with-ar=ar --with-batch=1 --with-cc=cc --with-clib-autodetect=0 --with-cxx=CC --with-cxxlib-autodetect=0 --with-debugging=0 --with-dependencies=0 --with-fc=ftn --with-fortran-datatypes=0 --with-fortran-interfaces=0 --with-fortranlib-autodetect=0 --with-ranlib=ranlib --with-scalar-type=real --with-shared-ld=ar --with-etags=0 --with-dependencies=0 --with-dependencies=0 --with-mpi-dir=/opt/cray/mpt/7.0.0/gni/mpich2-intel/140 --with-superlu=1 --with-superlu-include=/opt/cray/tpsl/1.4.4/INTEL/140/sandybridge/include --with-superlu-lib=/opt/cray/tpsl/1.4.4/INTEL/140/sandybridge/lib/libsuperlu.a --with-superlu_dist=1 --with-superlu_dist-include=/opt/cray/tpsl/1.4.4/INTEL/140/sandybridge/include --with-superlu_dist-lib=/opt/cray/tpsl/1.4.4/INTEL/140/sandybridge/lib/libsuperlu_dist.a --with-parmetis=1 --with-parmetis-include=/opt/cray/tpsl/1.4.4/INTEL/140/sandybridge/include --with-parmetis-lib=/opt/cray/tpsl/1.4.4/INTEL/140/sandybridge/lib/libparmetis.a --with-metis=1 --with-metis-include=/opt/cray/tpsl/1.4.4/INTEL/140/sandybridge/include --with-metis-lib=/opt/cray/tpsl/1.4.4/INTEL/140/sandybridge/lib/libmetis.a --with-ptscotch=1 --with-ptscotch-include=/opt/cray/tpsl/1.4.4/INTEL/140/sandybridge/include --with-ptscotch-lib="-L/opt/cray/tpsl/1.4.4/INTEL/140/sandybridge/lib -lptscotch -lscotch -lptscotcherr -lscotcherr" --with-scalapack=1 --with-scalapack-include=/opt/cray/libsci/13.0.3/INTEL/140/x86_64/include --with-scalapack-lib="-L/opt/cray/libsci/13.0.3/INTEL/140/x86_64/lib -lsci_intel_mpi_mp -lsci_intel_mp" --with-mumps=1 --with-mumps-include=/opt/cray/tpsl/1.4.4/INTEL/140/sandybridge/include --with-mumps-lib="-L/opt/cray/tpsl/1.4.4/INTEL/140/sandybridge/lib -lcmumps -ldmumps -lesmumps -lsmumps -lzmumps -lmumps_common -lptesmumps -lpord" --CFLAGS="-xavx -openmp -O3 " --CXXFLAGS="-xavx -openmp -O3  " --FFLAGS="-xavx -openmp -O3  " --LIBS=-lstdc++ --CXX_LINKER_FLAGS= --PETSC_ARCH=sandybridge --prefix=/opt/cray/petsc/3.5.3.0/real/INTEL/140/sandybridge --with-hypre=1 --with-hypre-include=/opt/cray/tpsl/1.4.4/INTEL/140/sandybridge/include --with-hypre-lib=/opt/cray/tpsl/1.4.4/INTEL/140/sandybridge/lib/libHYPRE.a --with-sundials=1 --with-sundials-include=/opt/cray/tpsl/1.4.4/INTEL/140/sandybridge/include --with-sundials-lib="-L/opt/cray/tpsl/1.4.4/INTEL/140/sandybridge/lib -lsundials_cvode -lsundials_cvodes -lsundials_ida -lsundials_idas -lsundials_kinsol -lsundials_nvecparallel -lsundials_nvecserial"
> [0]PETSC ERROR: #1 User provided function() line 0 in  unknown file
> Rank 0 [Tue Jul 19 02:54:04 2016] [c1-0c1s6n1] application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0
> srun: error: nid00281: task 0: Aborted
> srun: Terminating job step 1330433.0
> slurmstepd: *** STEP 1330433.0 ON nid00281 CANCELLED AT 2016-07-19T02:54:04 ***
> srun: Job step aborted: Waiting up to 32 seconds for job step to finish.
> srun: error: nid00281: tasks 1-17: Killed
> srun: error: nid00282: tasks 18-35: Killed
> 
> 
> 
> I ran the same code on my PC with 8 processors and it had no issues, but when I tried it on a different machine I got this error. Any idea? Can I use SuperLU_dist instead of MUMPS? I also got an INFOG(1)=-22 error from MUMPS in another run.
> 
> 
> 
> Thanks,
> 
> 
> 
> M Hassan
> 
> 
> 
> 
> -- 
> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.
> -- Norbert Wiener


