[petsc-users] random SLEPc segfault using openmpi-3.0.1

Smith, Barry F. bsmith at mcs.anl.gov
Fri Oct 19 16:21:44 CDT 2018



> On Oct 19, 2018, at 2:08 PM, Moritz Cygorek <mcygorek at uottawa.ca> wrote:
> 
> Hi,
> 
> I'm using SLEPc to diagonalize a huge sparse matrix and I've encountered random segmentation faults. 

https://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind
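
For an MPI run this typically looks something like the following (a sketch; adjust the launcher name and paths for your installation, and expect valgrind to slow the run down considerably):

    mpirun -n 28 valgrind --tool=memcheck -q --num-callers=20 \
        --log-file=valgrind.log.%p ./ex4 \
        -file amatrix.bin -eps_tol 1e-6 -eps_target 0 -eps_nev 18 \
        -eps_harmonic -eps_ncv 40 -eps_max_it 100000

The %p in --log-file expands to the process id, so each MPI rank writes its own log and you can see which rank triggers the invalid access.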


> 
> 
> I'm actually using the SLEPc example 4 (ex4) without modifications, to rule out errors due to my own coding.
> Concretely, I use the command line 
> 
> ompirun -n 28 ex4 \
> -file amatrix.bin -eps_tol 1e-6 -eps_target 0 -eps_nev 18 \
> -eps_harmonic -eps_ncv 40 -eps_max_it 100000 \
> -eps_monitor -eps_view  -eps_view_values -eps_view_vectors 2>&1 |tee -a $LOGFILE
> 
> 
> 
> The program runs for some time (about half a day) and then stops with the error message
> 
> 
> [13]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range
> 
> There is definitely enough memory, because I'm using less than 4% of the available 128GB.
> 
> 
> 
> Since everything worked fine on a slower computer with a different setup, and based on previous mailing list comments, I have the feeling that this might be due to some issue with MPI.
> 
> Unfortunately, I have to share the computer with other people and cannot uninstall the current MPI implementation. I've also heard that there can be issues if more than one MPI implementation is installed.
> 
> For your information: I've configured PETSc with
> 
> ./configure  --with-mpi-dir=/home/applications/builds/intel_2018/openmpi-3.0.1/ --with-scalar-type=complex --download-mumps --download-scalapack --with-blas-lapack-dir=/opt/intel/compilers_and_libraries_2018.2.199/linux/mkl
> 
> 
> 
> 
> I wanted to ask a few things:
> 
> - Is there a known issue with openmpi causing random segmentation faults?
> 
> - I've also tried to install everything needed by configuring PETSc with
> 
> ./configure \
> --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --with-scalar-type=complex \
> --download-mumps --download-scalapack --download-mpich --download-fblaslapack
> 
> Here, the problem is that the checks run after "make" stop after the check with 1 MPI process, i.e., the check using 2 MPI processes just never finishes.
> Is that a known conflict between the downloaded MPICH and the installed OpenMPI?
> Do you know a way to install MPICH without conflicts with OpenMPI, and without actually removing OpenMPI?
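
One thing worth checking (a sketch, assuming the default in-place build layout): --download-mpich installs its own launcher under $PETSC_DIR/$PETSC_ARCH/bin, and the 2-process check can hang if the OpenMPI mpiexec found in your PATH is picked up instead of the downloaded MPICH one. You can run a test manually with the full path to the downloaded launcher, e.g.

    $PETSC_DIR/$PETSC_ARCH/bin/mpiexec -n 2 ./ex4 -file amatrix.bin -eps_nev 18

and verify which launcher the makefiles use by looking at the MPIEXEC entry in $PETSC_DIR/$PETSC_ARCH/lib/petsc/conf/petscvariables. Two MPI installations can coexist as long as each executable is built and launched with the same one.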
> 
> 
> - Some time ago I posted a question to the mailing list about how to compile SLEPc/PETSc with OpenMP only, instead of MPI. After some time, I was able to get MPI to work on a different computer,
> but I was never really able to use OpenMP with SLEPc, although it would be very useful in the present situation. The programs compile, but they never take more than 100% CPU load as displayed by top.
> The answers to my question recommended configuring with --download-openblas and setting the OMP_NUM_THREADS variable when executing the program. I did that, but it didn't help either.
> So, my question: has anyone ever managed to find a configure line that disables MPI but enables OpenMP, so that the SLEPc ex4 program uses significantly more than 100% CPU when executing the standard Krylov-Schur method?
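
A configure line along these lines should build PETSc (and then SLEPc) without MPI but with OpenMP enabled and a threaded BLAS (a sketch only; --with-mpi=0, --with-openmp and --download-openblas are existing configure options, but how much it helps depends on where the time is spent):

    ./configure --with-cc=gcc --with-cxx=g++ --with-fc=gfortran \
        --with-scalar-type=complex --with-mpi=0 --with-openmp --download-openblas

    export OMP_NUM_THREADS=28
    export OPENBLAS_NUM_THREADS=28
    ./ex4 -file amatrix.bin -eps_tol 1e-6 -eps_target 0 -eps_nev 18 -eps_harmonic -eps_ncv 40

Note, however, that in a standard PETSc build the sparse matrix-vector product used by Krylov-Schur is not itself threaded; the extra cores are mostly exploited inside dense BLAS/LAPACK calls, so the CPU load reported by top may stay close to 100% even when everything is configured correctly.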
>  
> 
> 
> 
> Regards,
> Moritz
