[petsc-users] [SLEPc] Krylov-Schur convergence

Fabian.Jakub Fabian.Jakub at physik.uni-muenchen.de
Fri Nov 16 09:52:20 CST 2018


Concerning your gdb attach error, missing symbols, etc.: you may be running
into the ptrace_scope restriction, as suggested in your PETSc output.

<https://rajeeshknambiar.wordpress.com/2015/07/16/attaching-debugger-and-ptrace_scope/>

If you can become root on your system, you can try again after disabling
this security feature:

echo 0 | sudo tee /proc/sys/kernel/yama/ptrace_scope
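
Before changing anything, it can help to see what the setting currently is.
A minimal sketch in Python, assuming the standard Yama path used in the
command above; the helper names read_ptrace_scope and describe_ptrace_scope
are made up for this illustration, and the value descriptions follow the
Linux kernel Yama documentation:

```python
from pathlib import Path

# Meanings of the Yama ptrace_scope values (per the Linux kernel Yama docs).
PTRACE_SCOPE_MEANING = {
    0: "classic ptrace permissions: attaching to your own processes is allowed",
    1: "restricted: only a parent process may attach to its descendants",
    2: "admin-only: attaching requires CAP_SYS_PTRACE",
    3: "no attach: ptrace attach is disabled until reboot",
}


def read_ptrace_scope(path="/proc/sys/kernel/yama/ptrace_scope"):
    """Return the current ptrace_scope as an int, or None if it cannot be read."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        # File missing (no Yama), unreadable, or not an integer.
        return None


def describe_ptrace_scope(value):
    """Map a ptrace_scope value to a short human-readable description."""
    if value is None:
        return "Yama ptrace_scope not available on this system"
    return PTRACE_SCOPE_MEANING.get(value, f"unknown value {value}")


if __name__ == "__main__":
    scope = read_ptrace_scope()
    print(scope, "->", describe_ptrace_scope(scope))
```

If this reports 1 or higher on the compute nodes, gdb attaching to an already
running rank will likely be refused, which would match the symptoms. Note that
the setting is per node, so the login node and the compute nodes may differ.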


On 11/16/18 9:38 AM, Ale Foggia via petsc-users wrote:
>>
>> One thing you can do is use a symmetric matrix format: -mat_type sbaij
>> In this way, your matrix will always be symmetric because only the upper
>> triangular part is stored. The drawback is that efficiency will likely
>> decrease.
>>
>> To check symmetry, one (possibly bad) way is to take a bunch of random
>> vectors X and check that X'*A*X is symmetric. This can be easily done with
>> SLEPc's BVMatProject, see test9.c under
>> $SLEPC_DIR/src/sys/classes/bv/examples/tests
>>
>> Jose
>>
>>
> Is it possible that the asymmetry arises because of a numerical issue? It
> starts happening when my numbers are bigger than 2**32. I've checked that
> I'm always using PetscInt and I compiled the library with "64 bit
> integers". It seems strange to me that it only happens after some point and
> that the asymmetry is bigger than 10**-2.
> 
> 
>>> Looking at the failure in the debugger would really help us, for example
>>> with a stack trace.
>>>
>>>   Thanks,
>>>
>>>     Matt
>>
> 
> I'm trying to use the debugger, but I'm having problems with missing symbols
> and libraries. I'm sending you the output from the end of the run in case it
> helps; meanwhile I'll keep trying to get the debugger stack trace. The error
> that stops the program occurs only when I run on separate nodes; when I
> run the code on the login node, there is no problem.
> 
> [296]PETSC ERROR: Caught signal number 7 BUS: Bus Error, possibly illegal memory access
> slurmstepd: error: Detected 14 oom-kill event(s) in step 2596337.0 cgroup. Some of your processes may have been killed by the cgroup out-of-memory handler.
> [296]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
> [296]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind
> [296]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors
> [296]PETSC ERROR: likely location of problem given in stack below
> [296]PETSC ERROR: ---------------------  Stack Frames ------------------------------------
> Fatal error in PMPI_Waitall: Other MPI error, error stack:
> PMPI_Waitall(405)...............: MPI_Waitall(count=319, req_array=0x1f44940, status_array=0x1f3a9c0) failed
> MPIR_Waitall_impl(221)..........: fail failed
> PMPIDI_CH3I_Progress(623).......: fail failed
> pkt_RTS_handler(317)............: fail failed
> do_cts(662).....................: fail failed
> MPID_nem_lmt_dcp_start_recv(302): fail failed
> dcp_recv(165)...................: Internal MPI error!  Cannot read from remote process
>  Two workarounds have been identified for this issue:
>  1) Enable ptrace for non-root users with:
>     echo 0 | sudo tee /proc/sys/kernel/yama/ptrace_scope
>  2) Or, use:
>     I_MPI_SHM_LMT=shm
> 
> [296]PETSC ERROR: Note: The EXACT line numbers in the stack are not available,
> [296]PETSC ERROR:       INSTEAD the line number of the start of the function
> [296]PETSC ERROR:       is given.
> [296]PETSC ERROR: [296] MatCreateSubMatrices_MPIAIJ_Local line 2100 /opt/lib/petsc-3.9.3/src/mat/impls/aij/mpi/mpiov.c
> [296]PETSC ERROR: [296] MatCreateSubMatrices_MPIAIJ line 1977 /opt/lib/petsc-3.9.3/src/mat/impls/aij/mpi/mpiov.c
> [296]PETSC ERROR: [296] MatCreateSubMatrices line 6693 /opt/lib/petsc-3.9.3/src/mat/interface/matrix.c
> [296]PETSC ERROR: [296] MatIsTranspose_MPIAIJ line 1019 /opt/lib/petsc-3.9.3/src/mat/impls/aij/mpi/mpiaij.c
> [296]PETSC ERROR: [296] MatIsSymmetric_MPIAIJ line 1053 /opt/lib/petsc-3.9.3/src/mat/impls/aij/mpi/mpiaij.c
> [296]PETSC ERROR: [296] MatIsSymmetric line 8461 /opt/lib/petsc-3.9.3/src/mat/interface/matrix.c
> [296]PETSC ERROR: --------------------- Error Message --------------------------------------------------------------
> 
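
On the symmetry check Jose describes in the quoted text (random vectors X,
then look at whether X'*A*X is symmetric): the idea can be tried on a small
dense matrix before running it at scale. The following is only a pure-Python
sketch of that idea, not SLEPc's BVMatProject; the function names, the number
of vectors k, and the seed are assumptions made for illustration.

```python
import random


def matvec(A, x):
    """Dense matrix-vector product for A given as a list of rows."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def projected_asymmetry(A, k=4, seed=0):
    """Form B = X^T A X for k random vectors and return max |B[i][j] - B[j][i]|.

    For a symmetric A the result should be at rounding-error level;
    a clearly larger value indicates that A is not symmetric.
    """
    rng = random.Random(seed)
    n = len(A)
    # k random vectors of length n, stored as rows of X.
    X = [[rng.uniform(-1.0, 1.0) for _ in range(n)] for _ in range(k)]
    AX = [matvec(A, x) for x in X]
    # B[i][j] = x_i^T (A x_j)
    B = [[sum(xi * yi for xi, yi in zip(X[i], AX[j])) for j in range(k)]
         for i in range(k)]
    return max(abs(B[i][j] - B[j][i]) for i in range(k) for j in range(k))


if __name__ == "__main__":
    sym = [[2.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 4.0]]
    bad = [[2.0, 1.0, 0.5], [1.0, 3.0, 1.0], [0.0, 1.0, 4.0]]  # entry (0,2) != (2,0)
    print("symmetric matrix:    ", projected_asymmetry(sym))
    print("non-symmetric matrix:", projected_asymmetry(bad))
```

Compared with MatIsSymmetric, whose stack above goes through
MatCreateSubMatrices, a projection like this only needs matrix-vector
products, so it may be less memory-hungry on a large parallel matrix. An
asymmetry of the size reported (larger than 10**-2) would be far above
rounding error and should show up clearly.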
