[petsc-users] [SLEPc] Krylov-Schur convergence
Ale Foggia
amfoggia at gmail.com
Fri Nov 16 02:38:41 CST 2018
>
> One thing you can do is use a symmetric matrix format: -mat_type sbaij
> In this way, your matrix will always be symmetric because only the upper
> triangular part is stored. The drawback is that efficiency will likely
> decrease.
>
> To check symmetry, one (possibly bad) way is to take a bunch of random
> vectors X and check that X'*A*X is symmetric. This can be easily done with
> SLEPc's BVMatProject, see test9.c under
> $SLEPC_DIR/src/sys/classes/bv/examples/tests
>
> Jose
>
>
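The random-vector check Jose describes can be sketched roughly as follows. This is only a small dense illustration in plain Python, not SLEPc code: the helper names `matvec` and `projected_asymmetry` and the two 3x3 test matrices are made up for the example, and a real run would use BVMatProject on the distributed matrix instead.

```python
import random

def matvec(A, x):
    """Dense matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A]

def projected_asymmetry(A, k=4, seed=0):
    """Project A onto k random vectors and measure how far H = X'*A*X is
    from being symmetric, relative to the largest entry of H."""
    rng = random.Random(seed)
    n = len(A)
    # k random vectors of length n (hypothetical stand-in for a BV object)
    X = [[rng.uniform(-1.0, 1.0) for _ in range(n)] for _ in range(k)]
    AX = [matvec(A, x) for x in X]
    # H[i][j] = x_i' * A * x_j
    H = [[sum(xi * yi for xi, yi in zip(X[i], AX[j])) for j in range(k)]
         for i in range(k)]
    scale = max(abs(h) for row in H for h in row) or 1.0
    diff = max(abs(H[i][j] - H[j][i]) for i in range(k) for j in range(k))
    return diff / scale

# Illustrative matrices: one symmetric, one with a single perturbed entry
sym = [[2.0, -1.0, 0.0], [-1.0, 2.0, -1.0], [0.0, -1.0, 2.0]]
nonsym = [[2.0, -1.0, 0.0], [-1.0, 2.0, -1.0], [0.0, -0.5, 2.0]]

print("symmetric matrix:    ", projected_asymmetry(sym))
print("non-symmetric matrix:", projected_asymmetry(nonsym))
```

A value near machine precision suggests symmetry, while a clearly larger value indicates an asymmetric matrix. As Jose notes, this is a probabilistic check and may miss some asymmetries.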
Is it possible that the asymmetry arises from a numerical issue? It
starts happening when my numbers are bigger than 2**32. I've checked that
I'm always using PetscInt, and I compiled the library with "64 bit
integers". It seems strange to me that it only happens past a certain
point and that the asymmetry is bigger than 10**-2.
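One possible explanation, offered only as an assumption since the thread does not establish it, is that some intermediate value in the application code passes through a 32-bit type even though PetscInt itself is 64-bit. The sketch below uses Python's ctypes to show what a value looks like after being stored in a 32-bit signed integer. The helper names are hypothetical and are not part of PETSc.

```python
import ctypes

def through_int32(value):
    """Value as read back after being stored in a 32-bit signed integer."""
    return ctypes.c_int32(value).value

def through_int64(value):
    """Value as read back after being stored in a 64-bit signed integer."""
    return ctypes.c_int64(value).value

# Values around the 2**31 and 2**32 boundaries
for v in (2**31 - 1, 2**31, 2**32, 2**32 + 5):
    print(v, "-> int32:", through_int32(v), " int64:", through_int64(v))
```

Here 2**32 + 5 comes back as 5 from the 32-bit path but is unchanged in the 64-bit path. A silent truncation like this would produce wrong indices or values only once the numbers exceed the 32-bit range, which would match a problem that appears "after some point".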

> Looking at the failure in the debugger would really help us, for example
> with a stack trace.
>
> Thanks,
>
> Matt
>
I'm trying to use the debugger, but I'm having problems with missing symbols
and libraries. I'm sending you the output from the end of the run in case it
helps; meanwhile I'll keep trying to get a stack trace from the debugger.
The error that stops the program occurs only when I run on separate nodes;
when I run the code on the login node, there is no problem.

[296]PETSC ERROR: Caught signal number 7 BUS: Bus Error, possibly illegal memory access
slurmstepd: error: Detected 14 oom-kill event(s) in step 2596337.0 cgroup. Some of your processes may have been killed by the cgroup out-of-memory handler.
[296]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
[296]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind
[296]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors
[296]PETSC ERROR: likely location of problem given in stack below
[296]PETSC ERROR: --------------------- Stack Frames ------------------------------------
Fatal error in PMPI_Waitall: Other MPI error, error stack:
PMPI_Waitall(405)...............: MPI_Waitall(count=319, req_array=0x1f44940, status_array=0x1f3a9c0) failed
MPIR_Waitall_impl(221)..........: fail failed
PMPIDI_CH3I_Progress(623).......: fail failed
pkt_RTS_handler(317)............: fail failed
do_cts(662).....................: fail failed
MPID_nem_lmt_dcp_start_recv(302): fail failed
dcp_recv(165)...................: Internal MPI error! Cannot read from remote process
Two workarounds have been identified for this issue:
1) Enable ptrace for non-root users with:
echo 0 | sudo tee /proc/sys/kernel/yama/ptrace_scope
2) Or, use:
I_MPI_SHM_LMT=shm
[296]PETSC ERROR: Note: The EXACT line numbers in the stack are not available,
[296]PETSC ERROR: INSTEAD the line number of the start of the function
[296]PETSC ERROR: is given.
[296]PETSC ERROR: [296] MatCreateSubMatrices_MPIAIJ_Local line 2100 /opt/lib/petsc-3.9.3/src/mat/impls/aij/mpi/mpiov.c
[296]PETSC ERROR: [296] MatCreateSubMatrices_MPIAIJ line 1977 /opt/lib/petsc-3.9.3/src/mat/impls/aij/mpi/mpiov.c
[296]PETSC ERROR: [296] MatCreateSubMatrices line 6693 /opt/lib/petsc-3.9.3/src/mat/interface/matrix.c
[296]PETSC ERROR: [296] MatIsTranspose_MPIAIJ line 1019 /opt/lib/petsc-3.9.3/src/mat/impls/aij/mpi/mpiaij.c
[296]PETSC ERROR: [296] MatIsSymmetric_MPIAIJ line 1053 /opt/lib/petsc-3.9.3/src/mat/impls/aij/mpi/mpiaij.c
[296]PETSC ERROR: [296] MatIsSymmetric line 8461 /opt/lib/petsc-3.9.3/src/mat/interface/matrix.c
[296]PETSC ERROR: --------------------- Error Message --------------------------------------------------------------